RESULT 1
AAB62210 

ID AAB62210 standard; protein; 2436 AA. 
XX 

AC AAB62210; 
XX 

DT 11-JUN-2001 (first entry)
XX 

DE Human ABCA2 transporter protein. 
XX 

KW ABCA2; transporter protein; gene therapy; cell transport; human. 
XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers

FT Domain 22..40

FT /note= "transmembrane domain TM 1"

FT Domain 227..914

FT /note= "transmembrane domain TM 6"

FT Domain 707..729

FT /note= "transmembrane domain TM 2"

FT Domain 750..772

FT /note= "transmembrane domain TM 3"

FT Domain 784..806

FT /note= "transmembrane domain TM 4"

FT Domain 813..835

FT /note= "transmembrane domain TM 5"

FT Region 1007..1193

FT /note= "ATP binding cassette"

FT Domain 1457..1477

FT /note= "hydrophobic domain HHD"

FT Domain 1794..1815

FT /note= "transmembrane domain TM 7"

FT Domain 1845..1867

FT /note= "transmembrane domain TM 8"

FT Domain 1876..1898

FT /note= "transmembrane domain TM 9"

FT Domain 1905..1927

FT /note= "transmembrane domain TM 10"

FT Domain 1946..1968

FT /note= "transmembrane domain TM 11"

FT Domain 1988..2010

FT /note= "transmembrane domain TM 12"

FT Region 2070..2252

FT /note= "ATP binding cassette"
XX 

PN WO200121798-A2. 
XX 

PD 29-MAR-2001. 
XX 

PF 31-AUG-2000; 2000WO-US040789.
XX

PR 20-SEP-1999; 99US-0154839P.
XX

PA (FOXC-) FOX CHASE CANCER CENT.
XX 

PI Tew KD, Vulevic B, Chen Z; 
XX 

DR WPI; 2001-257989/26. 

DR N-PSDB; AAF57452. 
XX 

PT New nucleic acid molecule for screening inhibitors of human ABCA2 

PT mediated transport, encoding a human ABCA2 transporter protein with a 

PT multi-domain structure including glycosylation and phosphorylation sites. 

XX 

PS Claim 6; Fig 7; 68pp; English. 
XX 

CC This represents the human ABCA2 transporter protein having a multi- 

CC domain structure including a number of glycosylation and phosphorylation 

CC sites, a lipocalin signature motif, nucleotide binding folds having

CC Walker A and B ATP binding sites, and a number of membrane spanning

CC helices. Human ABCA2 transporter polypeptides and nucleic acid encoding 

CC them are useful for identification, detection and/or molecular 



CC characterization of components involved in the transport of molecules 

CC across cell membranes. The nucleic acid is useful as a probe to detect 

CC the presence of and/or expression of genes encoding ABCA2 proteins, and 

CC in gene therapy. A host cell comprising the nucleic acid is useful for 

CC screening compounds that inhibit human ABCA2 mediated transport 
XX 

SQ Sequence 2436 AA; 

Query Match 100.0%; Score 12668; DB 4; Length 2436; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 2436; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60

Qy   61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRVVEEGNLFDPARPSLGSE 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRVVEEGNLFDPARPSLGSE 120

Qy  121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180

Qy  181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240

Qy  241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300

Qy  301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360

Qy  361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420

Qy  421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480

Qy  481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540

Qy  541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600

Qy  601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660

Qy  661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720

Qy  721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780

Qy  781 SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

Qy  841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900

Qy  901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960

Qy  961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020

Qy 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080

Qy 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260

Qy 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320

Qy 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380

Qy 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440

Qy 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

Qy 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560

Qy 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620

Qy 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVS 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVS 1680

Qy 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

Qy 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800

Qy 1801 VAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 VAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

Qy 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920

Qy 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980

Qy 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040

Qy 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100

Qy 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160

Qy 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220

Qy 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280

Qy 2281 YMITVRTKSSQSVKDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 YMITVRTKSSQSVKDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340

Qy 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400

Qy 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436
        ||||||||||||||||||||||||||||||||||||
Db 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436



RESULT 2 
ABP52093 

ID ABP52093 standard; protein; 2436 AA. 
XX 

AC ABP52093; 
XX 

DT 10-OCT-2002 (first entry) 
XX 

DE Homo sapiens ABC transporter ABCA2 protein SEQ ID NO: 45. 
XX 

KW ATP-binding cassette transporter; ABC transporter; modulation; D loop; 

KW cancer; bacterial infection; fungal infection; protozoal infection; 

KW antibacterial; fungicide; protozoacide.
XX

OS Homo sapiens.
XX 

PN EP1217066-A1. 
XX 

PD 26-JUN-2002. 
XX 

PF 21-DEC-2000; 2000EP-00870316.
XX

PR 21-DEC-2000; 2000EP-00870316.
XX 

PA (UYGE-) UNIV GENT. 
XX 

DR WPI; 2002-550404/59. 
XX 

PT Modulating activity of ATP-binding cassette (ABC) transporters by 

PT influencing dimerization of nucleotide binding domains through use of D 

PT loop sequence of an ABC transporter, or its antisense peptide or peptide 

PT mimetic. 

XX 

PS Disclosure; Fig 3; 290pp; English. 
XX 

CC The present invention describes a method (M1) for modulating the activity

CC of ATP-binding cassette (ABC) transporters by influencing the

CC dimerisation of the nucleotide binding domains comprises using: (a) a

CC polypeptide (polyP) consisting of 5-50 amino acids comprising the D loop

CC sequence 1 of an ABC transporter (ABP52049 to ABP52091); (b) a polyP

CC consisting of the D loop sequence of an ABC transporter; (c) a peptide

CC mimetic or antisense peptide of (a) or (b). ABC transporters have

CC antibacterial, fungicide and protozoacide activities. (M1) is useful for

CC selectively modulating the activity of ABC transporters belonging to the

CC group of multidrug transporter/P-glycoproteins. Bacterial, fungal or

CC protozoal ABC transporters are involved in the infection of a mammal or

CC in the induction of resistance to antibiotics or drugs in a mammal. (M1)

CC is useful for preventing, treating or alleviating diseases associated

CC with functionality of an ABC transporter. ABP52092 to ABP52140 represent

CC ABC transporter proteins given in the exemplification of the present

CC invention



SQ Sequence 2436 AA; 



Query Match 100.0%; Score 12668; DB 5; Length 2436; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 2436; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy    1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60

Qy   61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRVVEEGNLFDPARPSLGSE 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRVVEEGNLFDPARPSLGSE 120

Qy  121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180

Qy  181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240

Qy  241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300

Qy  301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360

Qy  361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420

Qy  421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480

Qy  481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540

Qy  541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600

Qy  601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660

Qy  661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720

Qy  721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780

Qy  781 SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

Qy  841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900

Qy  901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960

Qy  961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020

Qy 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080

Qy 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260

Qy 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320

Qy 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380

Qy 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440

Qy 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

Qy 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560

Qy 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620

Qy 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVS 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVS 1680

Qy 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

Qy 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800

Qy 1801 VAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 VAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

Qy 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920

Qy 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980

Qy 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040

Qy 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100

Qy 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160

Qy 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220

Qy 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280

Qy 2281 YMITVRTKSSQSVKDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 YMITVRTKSSQSVKDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340

Qy 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400

Qy 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436
        ||||||||||||||||||||||||||||||||||||
Db 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436



RESULT 3 
ABB76715 

ID ABB76715 standard; protein; 2436 AA. 
XX 

AC ABB76715; 
XX 

DT 06-JUN-2002 (first entry) 
XX 

DE Human ATP binding cassette transporter protein, ABCA2.
XX 

KW Human; ABCA2; neuroprotective; nootropic; antiparkinsonian; 

KW adenosine triphosphate binding cassette transporter protein; 

KW ATP binding cassette transporter protein; Alzheimer's disease; 

KW prion disease; Huntington's disease; Parkinson's disease. 
XX 

OS Homo sapiens.
XX 

PN WO200208424-A1. 
XX 

PD 31-JAN-2002. 
XX 

PF 26-JUL-2001; 2001WO-JP006457.
XX

PR 26-JUL-2000; 2000JP-00225462.
XX 

PA (BANY ) BANYU PHARM CO LTD. 

PA (INAG/) INAGAKI N. 

XX 

PI Inagaki N; 
XX 

DR WPI; 2002-179907/23. 

DR N-PSDB; ABL53009. 
XX 

PT Adenosine triphosphate (ATP) binding cassette transporter gene ABCA2 of 

PT human or rat origin and encoded protein, useful for screening inhibitors, 

PT promoters and regulators of ABCA2 activity as drugs and diagnosis of 

PT ABCA2-related diseases. 
XX 

PS Claim 1; Page 52-64; 118pp; Japanese. 
XX 

CC The present sequence is the protein sequence for human adenosine 

CC triphosphate (ATP) binding cassette transporter protein (ABCA2). ABCA2

CC can be used in the diagnosis, treatment and prevention of diseases such 

CC as Alzheimer's disease, prion diseases, Huntington's disease, and 

CC Parkinson's disease 

XX 

SQ Sequence 2436 AA; 

Query Match 99.9%; Score 12660; DB 5; Length 2436; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 2435; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 1 MGFLHQLQLLLWK^A^^LKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MGFLHQLQLLLWKNVTLKRRS PWVXAFEI FIPLVLFFILLGLRQKKPTISVKEVSFYTAA 60 



61 PLTSAGI LPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
61 PLTSAGI LPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRVVEEGNLFDPARPSLGSE 120 

121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180 

181 LAARVDPPEWHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240 

241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAAR7VRRFSGLSAELRNQLDVAKVSQQL 300 

301 GLDAPNGSDSSPQAPPPRRLQTU.LGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360 

I I I II I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360 

361 AS GAGGAANGT GAGAVMG PNATAEEGAP SAAALAT PDT LQGQC S AFVQLWAGLQP I LCGN 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

361 AS GAGGAAN GT GAGAVMG PN ATAE E GAP SAAALAT P DT LQ GQ C S AFVQ LWAGLQ P I L C GN 420 

421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 4 80 

481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540 

541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600 

601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660

661 WIQDMMERAIIDTFVGHDWEPGSWQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

661 WIQDMMERAIIDTFVGHDWEPGSWQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720 

721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 m 1 1 1 

721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780

781 SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
781 SHWIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840 

841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900 

901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 901 MVDAVWGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960 

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020 

Qy 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080 

Qy 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I 
Db 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140 

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200 

1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260 

Qy 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320 

Qy 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380 

I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380

Qy 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440 

Qy 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

Qy 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560 

I I I I I I I I I I I I II I I I I I I I 1,1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560 

Qy 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620 

Qy 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680 

Qy 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

Qy 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800



Qy 1801 VAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860 

I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1801 VAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

Qy 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920 

I I I I I I I I I I I I I I I I J I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920

Qy 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980 

I I 1 1 1 1 1 I 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 I I I 1 1 1 1 1 1 I 1 1 1 1 1 I 1 1 1 1 1 I I I I i I I II i I 1 1 1 1 

Db 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980 

Qy 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDYDVASERQR 2040 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040 

Qy 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
Db 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 

Qy 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

Qy 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 



Qy 2221 ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2221 ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 

Qy 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 

Qy 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

Qy 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 



RESULT 4 
AAE22903 

ID AAE22903 standard; protein; 2436 AA. 
XX 

AC AAE22903; 
XX 

DT 09-AUG-2002 (first entry) 
XX 

DE Human transporter and ion channel (TRICH) 2. 
XX 



KW Human; transporter and ion channel; TRICH; transport disorder;
KW diabetes mellitus; angina; Alzheimer's disease; neurological; epilepsy;
KW stroke; Huntington's disease; meningitis; muscle; myocarditis; cancer;
KW infectious myositis; arrhythmia; asthma; immunological; gene therapy;
KW acquired immunodeficiency syndrome; AIDS; allergy; atherosclerosis;
KW cell proliferative disorder; cerebroprotective; cirrhosis; hepatitis;
KW transgenic; neuroprotective; anticonvulsant; nootropic; cytostatic-
KW antiinflammatory; hepatotropic; psoriasis.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Domain 22..45
FT /label= Transmembrane_domain
FT Domain 784..803
FT /label= Transmembrane_domain
FT Domain 893..911
FT /label= Transmembrane_domain
FT Domain 1018..1198
FT /note= "ABC transporter domain"
FT Binding-site 1025..1032
FT /note= "ATP/GTP binding site"
FT Domain 1124..1138
FT /note= "ABC transporter motif"
FT Domain 1424..1437
FT /note= "Lipocalin motif"
FT Domain 1426..1437
FT /note= "Lipocalin motif"
FT Domain 1793..1813
FT /label= Transmembrane_domain
FT Domain 1845..1862
FT /label= Transmembrane_domain
FT Domain 1900..1926
FT /label= Transmembrane_domain
FT Domain 2081..2262
FT /note= "ABC transporter domain"
FT Binding-site 2088..2095
FT /note= "ATP/GTP binding site"
XX
PN WO200222684-A2.
XX
PD 21-MAR-2002.
XX
PF 14-SEP-2001; 2001WO-US028938.
XX
PR 15-SEP-2000; 2000US-0232685P.
PR 22-SEP-2000; 2000US-0234842P.
PR 29-SEP-2000; 2000US-0236882P.
PR 05-OCT-2000; 2000US-0239057P.
PR 13-OCT-2000; 2000US-0240540P.
PR 18-OCT-2000; 2000US-0241700P.
XX
PA (INCY-) INCYTE GENOMICS INC.
XX
PI Lee EA, Yue H, Lai PG, Walia NK, Baughn MR, Warren BA, Lee S;
PI Sanjanwala MS, Yao MG, Ramkumar J, Thornton M, Gandhi AR;
PI Policky JL, Elliott VS, Arvizu C, Raumann BE, Bruns CM, Naini A;
PI Hafalia AJA, Nguyen DB, Xu Y, Lu DAM, Ison CH, Griffin JA;
PI Reddy RM, Burford N;

XX 

DR WPI; 2002-393948/42. 

DR N-PSDB; AAD36299. 
XX 

PT Polypeptides of human transporters and ion channels, useful for 

PT diagnosing, treating or preventing transport, neurological, muscle, 

PT immunological and cell proliferative disorders. 

XX 

PS Claim 1; Page 136-142; 204pp; English. 
XX 

CC The invention relates to human transporters and ion channels (TRICH) and 

CC their corresponding nucleic acid sequences. TRICH is useful for screening 

CC an agonist/antagonist that modulates its activity. TRICH is useful as an 

CC immunogen for preparing antibodies which are useful for diagnosing a 

CC condition or disease associated with its expression in a subject, and for

CC detecting and purifying it from a sample. TRICH DNA is useful as probe or 

CC a primer for assessing toxicity of a test compound. Composition 

CC comprising TRICH or its agonist is useful for treating a disease or 

CC condition associated with decreased expression of functional TRICH and 

CC composition comprising TRICH antagonist is useful for treating a disease 

CC or condition associated with overexpression of TRICH. TRICH

CC sequence is used in the diagnosis and treatment of transport disorder 

CC e.g. diabetes mellitus, angina, Alzheimer's disease; neurological 

CC disorder e.g. epilepsy, stroke, Huntington's disease, bacterial and viral 

CC meningitis, muscle disorder e.g. myocarditis, infectious myositis, 

CC arrhythmias, asthma, immunological disorder e.g. acquired 

CC immunodeficiency syndrome (AIDS), allergies, atherosclerosis; and cell 

CC proliferative disorders e.g. cirrhosis, hepatitis, psoriasis and cancers. 

CC TRICH DNA is used in gene therapy. TRICH DNA is useful for creating 

CC knockin humanised animals (pigs) or transgenic animals (mice or rats) to 

CC model human disease. The present sequence is human TRICH protein 

XX 

SQ Sequence 2436 AA;

Query Match 99.9%; Score 12660; DB 5; Length 2436; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 2435; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVSFYTAA 60

Qy 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I l-l I I I 
Db 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120

Qy 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180

Qy 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II 
Db 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240



Qy  241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300

Qy  301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360

Qy  361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420

Qy  421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480

Qy  481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540

Qy  541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600

Qy  601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660

Qy  661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720

Qy  721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780

Qy  781 SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

Qy  841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900

Qy  901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960

Qy  961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020

Qy 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080

Qy 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140



Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260

Qy 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320

Qy 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380

Qy 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440

Qy 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

Qy 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560

Qy 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620

Qy 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVS 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVS 1680

Qy 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

Qy 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800

Qy 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

Qy 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920

Qy 1921 VATFLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 VATFLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980



Qy 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040 

I I I I I N I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040

Qy 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 

Qy 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

Qy 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220

I M I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I | | | | | | | | | | | M M 
Db 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220

Qy 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280

I I I I I II I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I 
Db 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280

Qy 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 

Qy 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

I I I I I I I I I I IJ I I I I I I I I I I I I | | | | | I I I I I I I I I I I I I I I I I | | | | | | | I I I I I I I 
Db 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

Qy 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 



RESULT 5 
AAG67160 

ID AAG67160 standard; protein; 2436 AA. 
XX 

AC AAG67160; 
XX 



DT 13-NOV-2001 (first entry) 
XX 

DE Amino acid sequence of a human 17114 transporter polypeptide. 
XX 

KW Human; transporter; 20685; 579; 17114; 23821; 33894; 32613; 

KW vesicular monoamine transporter; neurotransmitter-symporter;

KW ABC transporter; sulfate transporter; neurological disorder; 

KW central nervous system disorder; Parkinson's disease; depression; pain; 

KW infectious disease; cell proliferative disorder; cancer; blood disorder; 

KW immune disorder; inflammatory disorder; spleen disorder; lung disorder; 

KW Hodgkin's disease; Niemann-Pick disease; chronic bronchitis; ischemia;

KW colon disorder; cirrhosis; uterus disorder; endometrium disorder; 

KW endometrial stromal tumour; brain disorder; T-cell disorder; anemia; 

KW Sjogren syndrome; skin disorder; lupus erythematosus; heart disorder; 

KW haematopoietic stem cell; Alzheimer's disease; myocardial infarction; 

KW blood vessel; Kawasaki syndrome; red cell disorder; thymus disorder; 

KW B-cell disorder; kidney disorder; glomerulonephritis; breast disorder; 



KW testis disorder; thyroid disorder; Graves disease; pancreatitis; 

KW skeletal muscle disorder; tumour; pancreas disorder; 

KW small intestine disorder; celiac sprue. 
XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 

FT Domain 23. .42 

FT /note= "transmembrane domain" 

FT Domain 54. .71 

FT /note= "transmembrane domain" 

FT Domain 707. .724 

FT /note= "transmembrane domain" 

FT Domain 750. .772 

FT /note= "transmembrane domain" 

FT Domain 783. .806 

FT /note= "transmembrane domain" 

FT Domain 813. .834

FT /note= "transmembrane domain" 

FT Domain 893. .914 

FT /note= "transmembrane domain"

FT Domain 1018. .1198 

FT /note= "ABC transporter domain" 

FT Domain 1457. .1479 

FT /note= "transmembrane domain" 

FT Domain 1793. .1816 

FT /note= "transmembrane domain" 

FT Domain 1846. .1862

FT /note= "transmembrane domain" 

FT Domain 1875. .1898 

FT /note= "transmembrane domain" 

FT Domain 1905. .1929 

FT /note= "transmembrane domain" 

FT Domain 2081. .2262 

FT /note= "ABC transporter domain" 
XX 

PN WO200164875-A2. 
XX 

PD 07-SEP-2001. 
XX 

PF 28-FEB-2001; 2001WO-US006374 . 
XX 

PR 29-FEB-2000; 2000US-0185906P . 
XX 

PA (MILL-) MILLENNIUM PHARM INC. 
XX 

PI Glucksmann MA; 
XX 

DR WPI; 2001-550178/61. 

DR N-PSDB; AAH75187. 
XX 

PT Novel human transporter polypeptides useful for treating and diagnosing 

PT Parkinson's disease, Hodgkin disease, glomerulonephritis, myocardial 

PT infarction, Grave's disease, Alzheimer's disease, anemia, asthma and 

PT tumors. 
XX 

PS Claim 9; Fig 14A-G; 259pp; English. 



CC The present sequence represents a human transporter polypeptide. The 

CC specification describes 20685, 579, 17114, 23821, 33894 or 32613 human 

CC transporter polypeptides. The 20685 transporter is similar to vesicular 

CC monoamine transporters. The 579 transporter is similar to 

CC neurotransmitter-symporters. The 17114 transporter is similar to ABC

CC transporters. The 32613 transporter is similar to sulfate transporters. 

CC The transporter polypeptides and polynucleotides are useful for treating 

CC and diagnosing neurological and central nervous system disorders (e.g. 

CC Parkinson's disease, depression, pain), infectious disease, cell 

CC proliferative disorders (e.g., cancer), blood disorders, and immune and 

CC inflammatory disorders. They are also useful for treating and diagnosing 

CC disorders involving the spleen (e.g., Hodgkin disease, Niemann-Pick 

CC disease), lung (e.g., chronic bronchitis), colon (cirrhosis), uterus and 

CC endometrium (e.g., endometrial stromal tumours), brain (e.g., ischemia), 

CC T-cells (e.g., Sjogren syndrome), skin (lupus erythematosus), 

CC haematopoietic stem cells (e.g., Alzheimer's disease), heart (e.g.,

CC myocardial infarction), blood vessels (e.g., Kawasaki syndrome), red 

CC cells (e.g., anemias), disorders involving thymus, B-cells, kidney (e.g., 

CC glomerulonephritis), disorders involving breast, testis, epididymis,

CC prostate, thyroid (e.g., Graves disease), disorders involving skeletal 

CC muscle (e.g., tumour), pancreas (e.g., pancreatitis), small intestine

CC (e.g., celiac sprue), disorders related to reduced platelet number and 

CC ovary 

XX 

SQ Sequence 2436 AA; 



Query Match 99.9%; Score 12656; DB 4; Length 2436; 

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 2434; Conservative 1; Mismatches 1; Indels 0; Gaps 0;



Qy 


1 


MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 


60 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 




Db 


1 


MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVSFYTAA 


60 


Qy 


61 


PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 


120 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


61 


PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 


120 


Qy 


121 


LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSV7VRNPQELWRFLTQNLSLPNSTAQ7VL 


180 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


121 


LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 


180 


Qy 


181 


LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 


240 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


181 


LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 


240 


Qy 


241 


TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 


300 






1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 III 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | | | | | | | | | | | | | | | 1 1 1 1 1 1 1 II 




Db 


241 


TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 


300 


Qy 


301 


GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLS7VLALLLPQGACTGRTPGPP 


360 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


301 


GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 


360 



Qy 



361 AS GAGGAANGTGAGAVMGPNATAEEGAPSAAALAT PDTLQGQCS AFVQLWAGLQP I LCGN 420 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 


361 


AS GAG GAANGT GAGAVMG PN ATAE EGAP S AAALAT P DT LQ GQ C S AFVQ LWAGLQ P I L C GN 


420 


Qy 


421 


NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 


480 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Ml 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


421 


NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 


480 


Qv 


481 


AFVGNVT H YAQVWLN I SAE I RS FL EQGRLQQH L RW LQQ YVAE LRLH P EALN L S L D E L P PA 


540 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


481 


AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 


540 


Qv 


541 


LRODNFSLPSGMALLOOLDTIDNAACGWIOFMSKVSVDT FKGFPDFFSTVNYTLNOAYOn 

■XJ A ^y 1 ' 1 ' *■ J - i *• fc* \~>l lruJJJ x K lViliuT.V-«Vj r ¥ X ^ 1 1 1>J1\ V kJ V VJ X, X I \VJ X X U 111 ill kJ _L_ V 1>» J. X Xll\l V^Z/V X \f u 


600 






1 1 1 1 M 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 ! 1 I 1 I I I I I I I I 




Db 


541 


LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 


600 


Qv 


601 


NVTVFASVI FOTRKDGSLPPHVHYKTRONSSFTF.KTNFTRRAYWRPGPMTnf;RFYFT YHF 

j. i* v x v i. i \yj v .x. x »^ x i >w» xj J. J- i.1 v ii i nx 1 1>| tj >J L X lit lv X IN Xj X, I \ L\S \ X V< x\x7 VJX7 IN X uuAL X X Xj X vJ X 








1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 | 1 1 1 1 1 1 | 1 | | | | | | | | 




Db 


601 


NVTVFASVI FQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 


660 


Qy


661 


VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV


720






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 




Db 


661 


VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV


720 


Qy


721 


AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH


780






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 




Db 


721 


AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAW 


780 


Qy


781 


SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH


840






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


781 


SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH


840 


Qy


841 


DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML


900






1 1 1 1 M 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


841 


DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML


900 


Qy


901 


MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV


960






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 




Db 


901 


MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV


960 


Qy 


961 


MEEDQACAMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQV


1020 






1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | | M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 




Db 


961 


MEEDQACAMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQV


1020 


Qy


1021 


VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL


1080 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 




Db 


1021 


VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 


1080 


Qy


1081 


TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 


1140 






II 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 





Db 


1081 


TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG


1140 


Qy 


1141 


GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL


1200 






1 1 1 1 1 M 1 M 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


1141 


GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL


1200 



Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I 

Db 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260



1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320

1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380

1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I 

1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440 

1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I 
1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560 

1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620 

1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680

1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800

1801 VAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
1801 VAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920 

1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980 

1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040

2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I 

2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 



Qy 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 

Db 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

Qy 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 

Qy 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 

I I 1 1 1 1 1 I 1 1 I 1 1 1 1 I 1 1 I 1 1 I 1 1 1 1 1 1 I 1 1 1 I I I I I I I I I 1 1 i 1 1 1 1 II 1 1 1 I 1 1 I I I I 

Db 2221 ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 

Qy 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 

I I Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340

Qy 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

Qy 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436



RESULT 6 
ABB76716 



ID ABB76716 standard; protein; 2434 AA. 
XX 

AC ABB76716; 
XX 

DT 06-JUN-2002 (first entry) 
XX 

DE Rat ATP binding cassette transporter protein, ABCA2.
XX 

KW Rat; ABCA2; neuroprotective; nootropic; antiparkinsonian;

KW adenosine triphosphate binding cassette transporter protein; 

KW ATP binding cassette transporter protein; Alzheimer's disease; 

KW prion disease; Huntington's disease; Parkinson's disease. 



XX 

OS Rattus sp. 
XX 

PN WO200208424-A1. 
XX 

PD 31-JAN-2002. 
XX 

PF 26-JUL-2001; 2001WO-JP006457.
XX 

PR 26-JUL-2000; 2000JP-00225462.
XX 

PA (BANY ) BANYU PHARM CO LTD. 

PA (INAG/) INAGAKI N.

XX 

PI Inagaki N; 
XX 

DR WPI; 2002-179907/23. 

DR N-PSDB; ABL53011. 



PT Adenosine triphosphate (ATP) binding cassette transporter gene ABCA2 of 

PT human or rat origin and encoded protein, useful for screening inhibitors,

PT promoters and regulators of ABCA2 activity as drugs and diagnosis of 

PT ABCA2-related diseases. 
XX 

PS Claim 6; Page 87-99; 118pp; Japanese. 
XX 

CC The present sequence is the protein sequence for rat adenosine 

CC triphosphate (ATP) binding cassette transporter protein (ABCA2). ABCA2 

CC can be used in the diagnosis, treatment and prevention of diseases such

CC as Alzheimer's disease, prion diseases, Huntington's disease, and

CC Parkinson's disease.

XX 

SQ Sequence 2434 AA; 



Query Match 92.6%; Score 11725; DB 5; Length 2434; 

Best Local Similarity 92.8%; Pred. No. 0; 

Matches 2262; Conservative 49; Mismatches 122; Indels 4; Gaps 4 







M^TTT WOT DJ T T TXTTTNn/'T'T V'DDQ PTaT\7T ft TTFT TTT PT "\/T T^TT T T CI DPlK'K'PTT QA/VJ?"\7 , PITV r PZ\ A 
JXlkjr JjnvijW^^riVViVl^ V 1 J_iJ\r\r\.0 r vw V.Li/\r EjJ. r J. ir ±j V J_i r r J. J_iJ_iLjJ_iKy rvrvr 1 J. O Va.Ej V rr C X lr\c\ 


60






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


1 


MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEA-FYTAA


59 


Qy




PT Af^TT PWD^T nPrirTlPHPTTT'Tr'T nvaM^TWHT T TTPT l"iD\A/"FTrr , 'NTT TrnPZiDPQT PQT? 


120






1 1 1 1 1 I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


60 


PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLNRVVEESNLFDPERPSLGSE


119 


Qy


121 


T.F.AT.ROHT.FAT.S A^PnTSnSHT.nRSTVSSFST.nSVARNPOFT.WRFT.TOMT.ST.PM^TAnAT. 


180






1 1 1 1 1 11111:1111 II 1 1 1 1 1 1 1 1 1 1 1 1 : : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


120 


LEALHQRLEALSSGPGTWESHSARPAVSSFSLDSVARDKRELWRFLMQNLSLPNSTAQAL 


179 


Qy 


181 


l^ARVDPPEWHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLIxAPALLEQLTC 


240 


Db 


180 


1 1 1 1 1 1 1 III 1 1 1 1 1 II : 1 : 1 II 1 1 II 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

L7^VDPSEVYRLLFGPLPDLDGKLGFLRKQEPWSHLGSNPLFQMEELLLAPALLEQLTC 


239 


Qy 


241 


TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 


300 






11111111111:11 : 1 1 1 1 1 1 1 1 1 1 1 1 1 III: II 1 : 1 1 1 1 1 1 1 : 1 1 : : 1 1 1 




Db 


240 


APGSGELGRILTMPEGHQVDLQGYRDAVCSGQATARAQHFSDLATELRNQLDIAKIAQQL 


299 


Qy 


301 


GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 


360 






1 : 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 




Db 


300 


GFNVPNGSDPQPQAPSPQSLQALLGDLLDVQKVLQDVDVLSALALLLPQGACAGRAPAPQ


359 


Qy 


361 


ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN


420 






1 1 II 1 1 II 1 II 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 




Db 


360 


AGSPSGPANSTGVGANTGPNTTVEEGTQSPVTPASPDTLQGQCSAFVQLWAGLQPILCGN 


419 


Qy 


421 


NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 


480 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


420 


NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEADHVILKANETF


479 


Qy 


481 


AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA


540 






II 1 1 1 1 1 1 1 II 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 II 




Db 


480 


AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLHWLQQYVADLRLHPEAMNLSLDELPPA 


539 


Qy 


541 


LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 


600 



M I I I I h I I I II I I I II I II I I I I I I I I I I I I M | | I | | | | | | | I I I I I I I I I I I I 



Db 


540 


LRLDYFSLPNGTALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD


599 


Qy 


601 


NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 


660 






1 1 1 M 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 II 1 1 M 1 1 1 1 II 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1 1 1 




Db 


600 


NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF


659 


Qy 


661 


VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV


720 






M 1 1 1 1 : 1 1 1 M : M 1 1 1 1 1 1 II 1 : 1 II 1 1 II 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


660 


WIQDMIERAI INT FVGHDWEPGNYVOMFP YPCYTRDDFLFVI 


719 


Qy 


721 


AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 


780 






1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 If ! II 1 1 1 1 I 1 1 1 1 1 1 1 




Db 


720 


AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWAWFITGFVQLSISVTALTAILKYGQVm^ 


779 


Qy 


781 


SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH


840 






IM:IMIIIIM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


780 


SHVLIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH


839 


Qy 


841 


DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 


900 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 I I I I I I I I I 




Db 


840 


DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 


899 


Qy 


901 


MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 


960 






Ml 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 II 1 1 1 1 1 




Db 


900 


MVDTVVYGVLTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTETWEWSWPWAHAPRLSV 


959 


Qy 


961 


MEEDQACAMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQV


1020 






1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 : II 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 




Db 


960 


MEEDQACAMESRHFEETRGMEEEPTHLPLVVCVDKLTKVYKNDKKLALNKLSLNLYENQV


1019 


Qy 


1021 


VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL


1080 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 




Db 


1020 


VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDQL 


1079


Qy 


1081 


TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 


1140 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 




Db 


1080 


TVEEHLWFYSRLKSMAQEEIRKEMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 


1139 


Qy 


1141 


GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL


1200 






1 1 M 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 




Db 


1140 


GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL


1199 


Qy 


1201 


KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 


1260 






1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1111:1111 II 1 1 : 1 1 1 : 1 1 1 1 1 II 1 




Db 


1200 


KCCGSPLFLKGAYGDGYRLTLVKRPAEPGTSQEPGMASSPSGRPQLSNCSEMQVSQFIRK 


1259 


Qy 


1261 


HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 


1320 






1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II II 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


1260 


HVASSLLVSDTSTELSYILPSEAVKKGAFERLFQQLEHSLDALHLSSFGLMDTTLEEVFL 


1319 


Qy 


1321 


KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG


1380 






11111111111111111 III 1 1 1 1 1 1 1 1 : 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 




Db 


1320 


KVSEEDQSLENSEADVKESRKDALPGAEGLTAVESQAGNLARCSELAQSQASLQSASSVG


1379 


Qy 


1381 


SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 


1440 



I I I I I I I I I I I I I I I I I II I I I lllhlllll I I I I : I I I I I I I I I : I I I I : I I I 



Db 


1380 


SARGDEGAGYTDGYGDYRPLFDNLQDPDSVSLQEAEMEALARVGQGSRKLEGWWLKMRQF 1439


Qy 


1441 


HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500






II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I I I I I I I 1 1 1 II 




Db 


1440 


HGLLVKRFHCARRNSKALCSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1499 


Qy 


1501 


PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 


1560 






M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 




Db 


1500 


PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPMLNLS 


1559 


Qy 


1561 


SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDED-LQAWNVSLPPT 


1619 






1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M Mill 




Db 


1560 


SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPLSPDEDSLLAWNTSLPPT 


1619 


Qy 


1620 


AGPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNV 1679 






M 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 




Db 


1620 


AGPETWTWAPSLPRLVHEPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNV 1679 


Qy


1680 


SEYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHS


1739 






MIIIIIMIIIIIIIIIIIIII: Mill Ml 1 MINIMI Ml MINIM 




Db 


1680 


SEYLLFTSDRFRLHRYGAITFGNIQKSIPAPIGTRTPLMVRKIAVRRVAQVLYNNKGYHS 


1739 


Qy


1740 


MPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFI 


1799 






1 1 1 II 1 1 III 1 II 1 1 1 1 II 1 II 1 M II II 1 1 1 1 II 1 1 M 1 1 II 1 1 II 1 1 II II II 1 1 1 1 1 




Db 


1740 


MPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFI 


1799 


Qy 


1800 


IVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIIL


1859 






1 1 1 II M M M 1 1 II 1 1 1 1 II 1 1 1 1 1 1 II II 1 1 1 : II II II II 1 1 1 1 M 1 MM 




Db 


1800 


IVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPVIYWLANYVWDMLNYLVPATCCIIIL


1859 


Qy 


1860 


FVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITA 


1919 






1 M 1 II 1 1 1 1 1 II 1 1 M 1 1 1 1 1 II 1 II 1 1 II 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 1 




Db 


1860 FVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITA 


1919 


Qy 


1920 


TVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDK 


1979 






M 1 1 1 1 1 1 1 1 1 1 M 1 1 II 1 1 1 II 1 II 1 II II 1 1 II II 1 1 II : 1 II 1 1 1 1 1 II 1 M 1 II 1 1 




Db 


1920 


TVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEIAYNEYINEYYAKIGQFDK 


1979 


Qy 


1980 


MKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQ 


2039 






N N 1 N || | | | Ml || | | | || || || : | | | : | | | | || | || || | | | || || 




Db 


1980 


MKSPFEWDIVTRGLVAMTVEGFVGFFLTIMCQYNFLRQPQRLPVSTKPVEDDVDVASERQ 


2039 


Qy 


2040 


RVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFK 


2099 






1 1 1 1 1 1 1 1 M 1 1 1 M 1 M 1 II 1 1 1 1 II II II 1 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 II II 1 1 II 1 1 




Db 


2040 


RVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFK 


2099 


Qy 


2100 


MLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGIS
1 1 1 1 1 1 1 1 1 1 1 II 1 II 1 : 1 1 1 M 1 II 1 II II II II 1 1 II II 1 


2159 


Db 


2100 


MLTGDESTTGGEAFVNGHSVLKDLLQVQQSLGYCPQFDALFDELTAREHLQLYTRLRGIP


2159 


Qy 


2160 


WKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDP


2219 






1 1 1 M : 1 1 : 1 1 1 III 1 1 1 1 M 1 II 1 M 1 II II II II 1 II 1 II 1 II II II 1 II 1 II 1 1 II 




Db 


2160 


WKDEAQWRWALEKLELTKCADKPAGSYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDP


2219 


Qy 


2220 


KARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGD


2279 






M M 1 II 1 II II II 1 1 1 1 1 1 1 1 1 1 1 1 II 1 II 1 : 1 1 1 II M II II II 1 II 1 1 1 1 M 1 1 1 1 1 




Db 


2220 


KARRFLWNLILDLIKTGRSVVLTSHSMEECEAVCTRLAIMVNGRLRCLGSIQHLKNRFGD


2279 






Qy 


2280 


GYMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVS


2339 






1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 |.| 1 1 1 1 1 1 1 




Db 


2280 


GYMITVRTKSSQNVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEHW 2339


Qy 


2340 


GVLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTE 


2399 






M 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 : 1 1 1 1 1 II 1 MM 1 II II 1 II 1 1 1 1 




Db 


2340 


GVLGIEDYSVSQTTLDNVFVNFAKKQSDNVEQQEAE-PSTLPSPLG-LLSLLRPRPAPTE 


2397 


Qy 


2400 


LRALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436








1 1 1 1 1 1 II 1 II 1 1 II II M II II 1 1 1 II II 1 1 1 M 1 1 




Db 


2398 


LRALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2434





RESULT 7 
AAY72649 



ID AAY72649 standard; protein; 2001 AA. 
XX 

AC AAY72649;
XX 

DT 31-MAY-2001 (first entry) 
XX 

DE Human ATP binding cassette2 (ABC2) transporter protein. 
XX 

KW Human; adenosine triphosphate; ATP; ATP binding cassette2 transporter; 

KW ABC2 transporter; nootropic; neuroprotective; anticonvulsant; neurotoxic; 

KW beta-amyloid; multidrug resistance; therapy; Alzheimer's disease; 

KW prion disease; Parkinson's disease; Huntington's disease; panic disorder; 

KW cholesterol misregulation; inflammatory disease; blood brain barrier; 

KW cancer; mood disorder. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Misc-difference 30

FT /label= Unknown

FT /note= "Encoded by GSG"

FT Misc-difference 70

FT /label= Unknown

FT /note= "Encoded by TYC"

FT Domain 274..296

FT /label= TMH

FT /note= "Transmembrane helix"

FT Misc-difference 280

FT /label= Unknown

FT /note= "Encoded by GYG"

FT Domain 317..339

FT /label= TMH

FT /note= "Transmembrane helix"

FT Domain 351..373

FT /label= TMH

FT /note= "Transmembrane helix"

FT Domain 380..398

FT /label= TMH

FT /note= "Transmembrane helix"

FT Domain 411..428

FT /label= TMH

FT /note= "Transmembrane helix"

FT Domain 457..479

FT /label= TMH

FT /note= "Transmembrane helix"

FT Misc-difference 477

FT /label= Unknown

FT /note= "Encoded by MCG"

FT Misc-difference 558

FT /label= Unknown

FT /note= "Encoded by TKC"

FT Domain 588..600

FT /label= Walker_A

FT Region 689..695

FT /label= ABC_signature

FT Domain 696..716

FT /label= Walker_B

FT Domain 1022..1044

FT /label= TMH

FT /note= "Transmembrane helix"

FT Domain 1358..1380

FT /label= TMH

FT /note= "Transmembrane helix"

FT Domain 1410..1432

FT /label= TMH

FT /note= "Transmembrane helix"

FT Domain 1441..1463

FT /label= TMH

FT /note= "Transmembrane helix"

FT Domain 1470..1492

FT /label= TMH

FT /note= "Transmembrane helix"

FT Misc-difference 1471

FT /label= Unknown

FT /note= "Encoded by GKG"

FT Domain 1553..1575

FT /label= TMH

FT /note= "Transmembrane helix"

FT Domain 1650..1662

FT /label= Walker_A

FT Misc-difference 1651

FT /label= Unknown

FT /note= "Encoded by CYC"

FT Misc-difference 1689

FT /label= Unknown

FT /note= "Encoded by CHC"

FT Misc-difference 1720

FT /label= Unknown

FT /note= "Encoded by CTN"

FT Misc-difference 1724

FT /label= Unknown

FT /note= "Encoded by YCC"

FT Region 1751..1758

FT /label= ABC_signature

FT Domain 1759..1780

FT /label= Walker_B

XX 

PN WO200114414-A2. 



PD 01-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO-CA000962.
XX 

PR 20-AUG-1999; 99US-0150073P.

PR 30-AUG-1999; 99US-0151457P.

PR 17-AUG-2000; 2000US-00641040.
XX 

PA (ACTI-) ACTIVEPASS PHARM INC. 
XX 



PI Le Bihan S, Wilson C, Charest DL;
XX 

DR WPI; 2001-202931/20. 

DR N-PSDB; AAD02722. 
XX 

PT Novel adenosine triphosphate (ATP) binding cassette transporter protein 

PT 2, useful as target for developing modulators that modulate activity of 

PT transporter protein and thus treat Alzheimer's disease and Parkinson's 

PT disease. 

XX

PS Claim 13; Fig 2; 92pp; English. 

XX 

CC The present sequence is human adenosine triphosphate (ATP) binding 

CC cassette2 (ABC2) transporter protein. ABC2 transporter molecules are 

CC transmembrane proteins which catalyse ATP-dependent transport of 

CC endogenous or exogenous substrates across the biological membranes. ABC2 

CC transporters have been associated with the transport of neurotoxic 

CC polypeptides (e.g., beta-amyloid) and substrates across the blood-brain- 

CC barrier. ABC2 sequence is useful as target for developing modulators that 

CC are useful for modulating amyloid deposition and thus for treating 

CC Alzheimer's disease, prion diseases, Parkinson's disease and Huntington's 

CC disease. It is also useful as targets for developing modulating agents of 

CC multidrug resistance exhibited by e.g., cancer cells. ABC transporters 

CC are also useful for treating mood and panic disorders, cholesterol 

CC misregulation and inflammatory diseases. It can also be used to treat 

CC disorders characterised by insufficient or excessive production of an 

CC ABC2 transporter protein or its inhibitors. Fragments of ABC transporters 

CC are used as immunogens for producing antibodies.

XX 

SQ Sequence 2001 AA; 



Query Match 80.9%; Score 10249; DB 4; Length 2001; 

Best Local Similarity 98.5%; Pred. No. 0; 

Matches 1973; Conservative 2; Mismatches 26; Indels 2; Gaps 2 

Qy 434 MSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVW 493

N I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 

Db 1 MSSLGFTSKEQRNLGLLVHLMTSNPKILYXPAGSEVDRVILKANETFAFVGNVTHYAQVW 60

Qy 494 LNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPALRQDNFSLPSGMA 553 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 LNISAEIRSXLEQGRLQQHLRWLQQYVAELRPHPEALNLSLDELPPALRQDNFSLPSGMA 120 

Qy 554 LLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFASVIFQTR 613

I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 LLQQLDTIDNAPCGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAGVIFQTR 180



614 KDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGFVWIQDMMERAIID 673 

M I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
181 KDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGFVWIQDMMERAIID 240 

674 TFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQHIVAEKEH 733

1 1 1 r 1 1 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

241 TFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMXISWVYSVAMTIQHIVAEKEH 300

734 RLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSHVVIIWLFLAVY 793

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
301 RLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSHVVIIWLFLAVY 360

794 AVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDKITAFEKCIASL 853

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 1 I I I II I I I I I I I I I I I I I I 
361 AVATIMFCFLVSVLYSKAKLASA-GGIIYFLSYVPYMYVAIREEVAHDKITAFEKCIASL 419

854 MSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILTWY 913

I I I I I I I I I I I I I I I I I I I I I I N I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

420 MSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILXWY 479

914 IEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQACAMESRR 973 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I 
480 IEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQACAMESRR 539 

974 FEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLGHNGAGKTT 1033

I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I! I I I I I I I I I I I I I I 
540 FEETRGMEEEPTHLPLVVXVDKLTKVYKDDKKLALNKLSLNLYENQGVSFLGHNGAGKTT 599

1034 TMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLK 1093
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
600 TMSILTGLFPPTSGSATIYGHDIRTEMDEIRKN-GHVPQHNVLFDRLTVEEHLWFYSRLK 658

1094 SMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAG 1153

I I I I I I I MINIMUM I I I I I I I I I I M : I I I I I I I I I I I I I I I I I I I I I 

659 SMAQEEIPREMDKMIEDLELSNKRHSLVQTLSGGMKRKVSVAIAFVGGSRAIILDEPTAG 718

1154 VDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTY 1213 
I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
719 VDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTY 778

1214 GDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTST 1273 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
779 GDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTST 838 

1274 ELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSE 1333 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 

839 ELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSGGDQSLENSG 898 


1334 ADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDV 1393 
I I I I II I I I I I I I I I I I I II I I I I I I I I I I II I I || | | | | | | M I I II I I I I I I I I I I 
899 ADVKESRKDVLPGAEGHASGEGHAGNLARCSELTQSQASLQSASSVGSALGDEGAGYTDV 958 

1394 YGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARR 1453
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
959 YGDYPPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARR 1018



1454 NSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEER 1513 

I I I I I I I I I I I I I I I I I I I I I II I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1019 NSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEER 1078 


1514 REYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFD 1573

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1079 REYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFD 1138

1574 SMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPR 1633

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 

1139 SMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGQEMWTSAPSLPR 1198 

1634 LVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLH 1693 

I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1199 LVREPVRCTCSAQGTGFSCPNSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLH 1258 

1694 RYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNNAILR 1753 

' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1259 RYGAITFGNVLKSIPASFGTRAPPMVRKIRCARAAQVFYNNKGYHSMPTYLNSLNNAILR 1318 

1754 ANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAMSFVPASFWF 1813 

I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
1319 ANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAMSFVPASFWF 1378 

1814 LVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNF 1873 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1379 LVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNF 1438 

1874 PAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDK 1933 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1439 PAVLSLFLLYGWSITPIMYPASFWFEVPSSAYXFLIVINLFIGITATVATFLLQLFEHDK 1498 

1934 DLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGL 1993 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 
1499 DLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGL 1558

1994 VAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKI 2053

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1559 VAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKI 1618 

2054 ENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAF 2113 

I I I I I I I I I I I I I I I I I I I I I I I I I I I N I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1619 ENLTKVYKSRKIGRILAVDRLCLGVRPGECFGXLGVNGAGKTSTFKMLTGDESTTGGEAF 1678 

2114 VNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEK 2173 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1679 VNGHSVLKELXQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGIXWKDEARVVKWALEK 1738

2174 LELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLI 2233 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1739 LELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLI 1798 

2234 KTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSV 2293 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml I I I I I I I I I I I I I I I I I I I 

1799 KTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSV 1858 

2294 KDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTT 2353 



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 1859 KDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTT 1918 

Qy 2354 LDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPEDLDT 2413 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1919 LDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPEDLDT 1978 

Qy 2414 EDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I I I I I I I I I I I 
Db 1979 EDEGLISFEEERAQLSFNTDTLC 2001 



RESULT 8 
ABB98347 

ID ABB98347 standard; protein; 2001 AA. 
XX 

AC ABB98347; 
XX 

DT 29-JAN-2003 (first entry) 
XX 

DE Human ABC transporter ABCA2 SEQ ID NO 8.
XX 



KW Human; ABC transporter; ABCB9; ABCB1; ABCA2; ABCG4; ABCG1;

KW amyloid precursor protein; adenosine tri-phosphate; nootropic; 

KW ATP-binding cassette transporter; beta-amyloid plaque formation; 

KW Alzheimer's disease; Parkinson's disease; Huntington's disease; 

KW gene therapy; transgenic; neuroprotective; anticonvulsant; 

KW antiparkinsonian. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Misc-difference 30

FT /note= "Encoded by GCG"

FT Misc-difference 70

FT /note= "Encoded by TTC"

FT Misc-difference 92

FT /note= "Encoded by CTG"

FT Misc-difference 132

FT /note= "Encoded by GCC"

FT Misc-difference 174

FT /note= "Encoded by AGT"

FT Misc-difference 280

FT /note= "Encoded by GTG"

FT Misc-difference 383..384

FT /note= "Encoded by GCCTGCGGT"

FT Misc-difference 477

FT /note= "Encoded by ACG"

FT Misc-difference 558

FT /note= "Encoded by TGC"

FT Misc-difference 586

FT /note= "Encoded by GTG"

FT Misc-difference 632..635

FT /note= "Encoded by AACCTGGGCATGTGC"

FT Misc-difference 666

FT /note= "Encoded by CGC"

FT Misc-difference 697

FT /note= "Encoded by CTG"

FT Misc-difference 889

FT /note= "Encoded by GAG"

FT Misc-difference 890

FT /note= "Encoded by GAG"

FT Misc-difference 898

FT /note= "Encoded by GAG"

FT Misc-difference 915

FT /note= "Encoded by CCG"

FT Misc-difference 948

FT /note= "Encoded by CGT"

FT Misc-difference 963

FT /note= "Encoded by CGC"

FT Misc-difference 1187

FT /note= "Encoded by CCA"

FT Misc-difference 1219

FT /note= "Encoded by AGC"

FT Misc-difference 1288

FT /note= "Encoded by GCG"

FT Misc-difference 1289

FT /note= "Encoded by GTG"

FT Misc-difference 1290

FT /note= "Encoded by CGC"

FT Misc-difference 1471

FT /note= "Encoded by GTG"

FT Misc-difference 1651

FT /note= "Encoded by CTC"

FT Misc-difference 1689

FT /note= "Encoded by CTC"

FT Misc-difference 1724

FT /note= "Encoded by TCC"
XX 

PN WO200264781-A2. 
XX 

PD 22-AUG-2002. 
XX 

PF 08-FEB-2002; 2002WO-CA000138.
XX 

PR 09-FEB-2001; 2001US-0267975P.

PR 31-JUL-2001; 2001US-0309256P.
XX 

PA (ACTI-) ACTIVE PASS PHARM INC. 
XX 

PI Reiner PB, Connop BP, Pollard M; 
XX 

DR WPI; 2002-667006/71. 

DR N-PSDB; ABV74350.
XX 

PT Regulating expression of amyloid precursor protein in a cell, useful in 

PT preventing or treating neurological disease, e.g. Alzheimer's disease, 

PT comprises regulating the expression or activity of an ATP-binding 

PT cassette transporter. 
XX 

PS Disclosure; Page; 78pp + Sequence Listing; English. 
XX 

CC The invention relates to regulating (M1) expression of amyloid precursor

CC protein in a cell, comprising regulating the expression or activity of an 



CC adenosine tri-phosphate (ATP)-binding cassette (ABC) transporter in the

CC cell. (M1) is useful for regulating expression of amyloid precursor

CC protein in a brain cell to prevent or inhibit pathological beta-amyloid 

CC plaque formation in conditions such as Alzheimer's disease, Parkinson's 

CC disease or Huntington's disease. (M1) is also useful in screening assays,

CC predictive medicine (e.g. diagnostic assays, prognostic assays, 

CC monitoring clinical trials or pharmacogenetics) or methods of treatment

CC (e.g. therapeutic, prophylactic, gene therapy). The transgenic animals 

CC are useful for testing methods and agents as candidates for modulating or 

CC altering the ABC transporter-related expression of amyloid precursor

CC protein. The present sequence is that of an ABC transporter protein 

CC encoding polynucleotide of the invention. Note: The sequence data for 

CC this patent is not represented in the printed specification but is based 

CC on sequence information supplied to Derwent by the European Patent Office 

XX 

SQ Sequence 2001 AA; 



Query Match 80.9%; Score 10249; DB 5; Length 2001; 

Best Local Similarity 98.5%; Pred. No. 0; 

Matches 1973; Conservative 2; Mismatches 26; Indels 2; Gaps 2; 

Qy 434 MSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVW 493

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | 
Db 1 MSSLGFTSKEQRNLGLLVHLMTSNPKILYXPAGSEVDRVILKANETFAFVGNVTHYAQVW 60

Qy 494 LNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPALRQDNFSLPSGMA 553

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I 

Db 61 LNISAEIRSXLEQGRLQQHLRWLQQYVAELRPHPEALNLSLDELPPALRQDNFSLPSGMA 120 

Qy 554 LLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFASVIFQTR 613 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I MINI 
Db 121 LLQQLDTIDNAPCGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAGVIFQTR 180

Qy 614 KDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGFVWIQDMMERAIID 673 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 KDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGFVWIQDMMERAIID 240

Qy 674 TFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQHIVAEKEH 733 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 241 TFVGHDWEPGSWQMFPYPCYTRDDFLFVIEHMMPLCMXISWVYSVAMTIQHIVAEKEH 300 

Qy 734 RLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSHWIIWLFLAVY 793

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 301 RLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSHWIIWLFLAVY 360

Qy 794 AVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDKITAFEKCIASL 853 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 AVATIMFCFLVSVLYSKAKLASA-GGIIYFLSYVPYMYVAIREEVAHDKITAFEKCIASL 419 

Qy 854 MSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILTWY 913

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II 
Db 420 MSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILXWY 479 



Qy 914 [illegible] 973
Db 480 [illegible] 539

Qy 974 FEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTT 1033
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 r 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 540 FEETRGMEEEPTHLPLWXVDKLTKVYKDDKKLALNKLSLNLYENQGVSFLGHNGAGKTT 599

Qy 1034 [illegible]
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I
Db 600 [illegible]

Qy 1094 SMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAG 1153
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I
Db 659 SMAQEEIPREMDKMIEDLELSNKRHSLVQTLSGGMKRKVSVAIAFVGGSRAIILDEPTAG 718

Qy 1154 VDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTY 1213
1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 719 VDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTY 778

Qy 1214 GDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTST 1273
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 779 GDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTST 838

Qy 1274 ELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSE 1333
I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 839 ELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSGGDQSLENSG 898

Qy 1334 ADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDV 1393
I I I I I I I I I I I M I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I
Db 899 ADVKESRKDVLPGAEGHASGEGHAGNLARCSELTQSQASLQSASSVGSALGDEGAGYTDV 958

Qy 1394 YGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARR 1453
I II I I I I I I I I I I I I I I I I I I I I 11 I I I I I I I II I I I I I I I I I I I I I II I I I I II I I I I
Db 959 YGDYPPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARR 1018

Qy 1454 NSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEER 1513
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1019 NSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEER 1078

Qy 1514 REYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFD 1573
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1079 REYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFD 1138

Qy 1574 SMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPR 1633
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1139 SMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGQEMWTSAPSLPR 1198

Qy 1634 LVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLH 1693
I I M I I I I I I I I I I I I I I I I : I I I I I I I | | I | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
Db 1199 LVREPVRCTCSAQGTGFSCPNSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLH 1258

Qy 1694 RYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNNAILR 1753
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I
Db 1259 RYGAITFGNVLKSIPASFGTRAPPMVRKIRCARAAQVFYNNKGYHSMPTYLNSLNNAILR 1318

Qy 1754 ANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAMSFVPASFWF 1813
I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I
Db 1319 ANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAMSFVPASFWF 1378

Qy 1814 LVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNF 1873
I I I I I I I I I M I I I I M I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I Mil
Db 1379 LVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNF 1438



Qy 1874 PAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDK 1933 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 1439 PAVLSLFLLYGWSITPIMYPASFWFEVPSSAYXFLIVINLFIGITATVATFLLQLFEHDK 1498 



Qy 1934 DLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGL 1993
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I I I 1 \ \ I 1 I 1 1 1 1 1 1 1 1 1 1 1
Db 1499 DLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGL 1558

Qy 1994 VAMAVEGVVGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKI 2053
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 i 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 1559 VAMAVEGVVGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKI 1618

Qy 2054 ENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAF 2113
I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 \ \ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 1619 ENLTKVYKSRKIGRILAVDRLCLGVRPGECFGXLGVNGAGKTSTFKMLTGDESTTGGEAF 1678

Qy 2114 VNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEK 2173
1 1 1 1 1 1 t t 1 1 i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i
Db 1679 VNGHSVLKELXQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGIXWKDEARWKWALEK 1738

Qy 2174 LELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLI 2233
1 i 1 1 I 1 l 1 l l i l l l l l l l l l l I I l i I i i t i i i i > i i i i t i i i i i i i i i i i i i i i i i i i i i
Db 1739 LELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLI 1798

Qy 2234 KTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSV 2293
I 1 1 1 1 1 1 \ 1 l 1 l l l l I l l i i t i i i i i i i i i i i i t i i i i i i i i i i i i i i t t i i i i i t i i i i
Db 1799 KTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSV 1858

Qy 2294 KDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTT 2353
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 1859 KDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTT 1918

Qy 2354 LDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPEDLDT 2413
M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 1919 LDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPEDLDT 1978

Qy 2414 EDEGLISFEEERAQLSFNTDTLC 2436
1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1
Db 1979 EDEGLISFEEERAQLSFNTDTLC 2001





RESULT 9 
AAE16781 

ID AAE16781 standard; protein; 1771 AA. 
XX 

AC AAE16781; 
XX 

DT 09-APR-2002 (first entry) 
XX 

DE Human transporter and ion channel-18 (TRICH-18) protein. 
XX 

KW Human; transporter and ion channel-18; TRICH-18; neuroprotective; asthma; 

KW nootropic; cytostatic; cardiovascular; immunosuppressive; cardiomyopathy; 

KW antiinflammatory; protein therapy; akinesia; cystic fibrosis; leukaemia; 



KW Bell's palsy; amyotrophic lateral sclerosis; Alzheimer's disease; cancer; 

KW amnesia; dementia; myocarditis; Duchenne's muscular dystrophy; AIDS;

KW Acquired Immune Deficiency Syndrome; Addison's disease; allergy; angina; 

KW cell proliferative disorder; psoriasis; cardiac disease; hypertension; 

KW bradyarrythmia; gene expression; drug screening.

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Domain 119..138

FT /label= Transmembrane_domain

FT Domain 228..246

FT /label= Transmembrane_domain

FT Domain 353..533

FT /note= "ABC transporter domain"

FT Binding-site 360..367

FT /label= P_loop

FT /note= "ATP/GTP binding site"

FT Domain 1128..1148

FT /label= Transmembrane_domain

FT Domain 1180..1197

FT /label= Transmembrane_domain

FT Domain 1235..1261

FT /label= Transmembrane_domain

FT Domain 1416..1597

FT /note= "ABC transporter domain"

FT Binding-site 1423..1430

FT /label= P_loop

FT /note= "ATP/GTP binding site"
XX 

PN WO200192304-A2. 
XX 

PD 06-DEC-2001. 
XX 

PF 25-MAY-2001; 2001WO-US017065.
XX 

PR 26-MAY-2000; 2000US-0208424P.

PR 01-JUN-2000; 2000US-0209001P.

PR 08-JUN-2000; 2000US-0210588P.

PR 16-JUN-2000; 2000US-0212335P.

PR 22-JUN-2000; 2000US-0213747P.

PR 29-JUN-2000; 2000US-0215391P.
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Thornton M, Walia NK, Yue H, Nguyen DB, Lai P, Gandhi AR; 

PI Tribouley CM, Yao MG, Ramkumar J, Au-Young J, Lu Y, Tang YT;

PI Azimzai Y, Bruns CM, Griffin JA, Yang J, Sanjanwala MS, Raumann BE;

PI Lee EA, Hafalia A, Baughn MR, Green BD, Khan FA, Kearney L; 

PI Elliot VS, Seilhamer JJ, Policky JL, Borowsky ML, Burford N, Ding L; 

PI Lu DAM, Hillman JL; 

XX 

DR WPI; 2002-122055/16. 

DR N-PSDB; AAD27271. 
XX 

PT New human transporters and ion channels (TRICH) polypeptides useful for 

PT diagnosing, treating or preventing disorders associated with aberrant 



PT expression of TRICH. 
XX 

PS Claim 1; Page 169-173; 210pp; English. 
XX 

CC The invention relates to human transporters and ion channels (TRICH) 

CC polypeptides and their cDNA molecules. The nucleic acid and polypeptide 

CC sequences are useful in the diagnosis, treatment, and prevention of 

CC disorders associated with transport (akinesia, cystic fibrosis, Bell's 

CC palsy, amyotrophic lateral sclerosis); neurological (Alzheimer's disease, 

CC amnesia, dementia); muscle (cardiomyopathy, myocarditis, Duchenne's

CC muscular dystrophy); immunological (AIDS, Addison's disease, allergies, 

CC asthma); cell proliferative disorders (cancers, leukaemia, psoriasis); 

CC cardiac disease (angina, hypertension, or bradyarrythmia) and in the

CC assessment of the effects of exogenous compounds on the expression of 

CC nucleic acid and amino acid sequences of transporters and ion channels. 

CC The polynucleotides may be used to detect and quantify gene expression in 

CC biopsied tissues in which TRICH expression may be correlated with a 

CC disease, to generate hybridization probes for mapping naturally occurring 

CC genomic sequence, and in drug screening. The present sequence is human 

CC TRICH-18 protein 

XX 

SQ Sequence 1771 AA; 



Query Match 72.9%; Score 9237; DB 5; Length 1771; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1771; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 666 MMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQ 725
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 1 MMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQ 60

Qy 726 HIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSHVVI 785
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 61 HIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSHVVI 120

Qy 786 IWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDKITA 845
1 1 1 1 1 1 1 M 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M
Db 121 IWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDKITA 180

Qy 846 FEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAV 905
1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | | | | 1 | | | | | |
Db 181 FEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAV 240

Qy 906 VYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQ 965
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 241 VYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQ 300

Qy 966 ACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLG 1025
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II
Db 301 ACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLG 360

Qy 1026 HNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEH 1085
1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1
Db 361 HNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEH 420

Qy 1086 LWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAI 1145
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Db 421 LWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAI 480 

Qy 1146 ILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGS 1205 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1. 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 ILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGS 540 

Qy 1206 PLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASC 1265 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 PLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASC 600 

Qy 1266 LLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEE 1325 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I J I I I I I I I I I 
Db 601 LLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEE 660 

Qy 1326 DQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGD 1385

I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 661 DQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGD 720

Qy 1386 EGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLV 1445 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 721 EGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLV 780 

Qy 1446 KRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNF 1505 

J I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 781 KRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNF 840 

Qy 1506 IPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESR 1565 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 IPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESR 900 

Qy 1566 LLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMW 1625 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 
Db 901 LLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMW 960 

Qy 1626 TSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLF 1685 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 961 TSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLF 1020 

Qy 1686 TSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLN 1745

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1021 TSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLN 1080

Qy 1746 SLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAMSF 1805 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1081 SLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAMSF 1140 

Qy 1806 VPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLP 1865

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1141 VPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLP 1200

Qy 1866 AYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFL 1925

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1201 AYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFL 1260

Qy 1926 LQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFE 1985 

I I I I I I 1 1 I I I I I I 1 1 I I I I I I I I I Ml I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I 

Db 1261 LQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFE 1320 



Qy 1986 WDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGD 2045
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1321 WDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGD 1380



2046 ADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDE 2105

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 

1381 ADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDE 1440 

2106 STTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEAR 2165 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I.I I I I 
1441 STTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEAR 1500 

2166 WKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFL 2225

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1501 WKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFL 1560 

2226 WNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITV 2285 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1561 WNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITV 1620 

2286 RTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIE 2345 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I -I I I I I I I 
1621 RTKSSQSVKDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIE 1680 

2346 DYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVA 2405 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1681 DYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVA 1740 

2406 DEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1741 DEPEDLDTEDEGLISFEEERAQLSFNTDTLC 1771 



RESULT 10 
AAB38110 

ID AAB38110 standard; protein; 2261 AA. 
XX 

AC AAB38110; 
XX 

DT 29-JAN-2001 (first entry) 
XX 

DE Human ABC1 cholesterol transporter mutant, V399A. 
XX 

KW Human ABC1 cholesterol transporter; chromosome 9q31; 

KW ATP-binding cassette; HDL deficiency disorder; high density lipoprotein; 

KW Tangier disease; TD; familial HDL deficiency; FHA; polymorphism; 

KW cardiovascular disease; coronary artery disease; coronary restenosis; 

KW cerebrovascular disease; peripheral vascular disease; 

KW Alzheimer's disease; Niemann-Pick disease; Huntington's disease; 

KW X-linked adrenoleukodystrophy; cancer; gene therapy; genetic diagnosis;

KW prognosis; prophylaxis; drug screening; transgenic animal; mutant; 

KW mutein. 

XX 

OS Homo sapiens. 
XX 

PN WO200055318-A2. 



XX 

PD 21-SEP-2000. 
XX 

PF 15-MAR-2000; 2000WO-IB000532.
XX 

PR 15-MAR-1999; 99US-0124702P.

PR 08-JUN-1999; 99US-0138048P.

PR 17-JUN-1999; 99US-0139600P.

PR 01-SEP-1999; 99US-0151977P.
XX 

PA (UYBR-) UNIV BRITISH COLUMBIA. 

PA (XENO-) XENON BIORESEARCH INC. 
XX 

PI Hayden MR, Wilson AR, Pimstone SN; 
XX 

DR WPI; 2000-587528/55. 
XX 

PT New ABC1 polypeptide is useful for treating diseases associated with ABC1 

PT biological activity, e.g. Alzheimer's disease, Huntington's disease and 

PT cancer. 
XX 

PS Example; Page; 229pp; English.
XX 

CC The invention relates to the human ABC1 cholesterol transporter protein 

CC (B38082) and to nucleic acid sequences (C69120) which encode it. ABC1 is 

CC a member of the ATP-binding cassette (ABC transporter) superfamily of 

CC proteins, and plays a crucial role in cholesterol transport, particularly 

CC intracellular cholesterol trafficking in monocytes and fibroblasts, being 

CC involved in cholesterol efflux from the cell. The gene encoding ABC1 is 

CC located on chromosome 9q31, and mutations in this gene are associated 

CC with two genetic HDL (high density lipoprotein) deficiency disorders, 

CC Tangier disease (TD) and familial HDL deficiency (FHA). These diseases 

CC are distinguishable in that TD is an autosomal recessive disorder, while 

CC FHA is inherited as an autosomal dominant trait. Low levels of HDL ("good 

CC cholesterol") in the blood correlate with a high risk of cardiovascular 

CC disease, particularly coronary artery disease, but also cerebrovascular 

CC disease, coronary restenosis, and peripheral vascular disease. 

CC Conversely, a high level of HDL has protective effects against 

CC cardiovascular disease. The invention provides genetic constructs and 

CC transgenic cells and non-human animals comprising human ABC1 nucleic 

CC acids, and methods of gene therapy for the treatment or prevention of 

CC cardiovascular disease comprising the administration of an expression 

CC vector encoding ABC1 or an active fragment thereof. The invention also 

CC encompasses compounds which mimic ABC1 activity, compounds which 

CC stimulate ABC1 expression and methods of screening for such compounds. It 

CC further relates to methods for determining whether a patient has an 

CC increased risk for cardiovascular disease due to polymorphisms in the 

CC ABC1 gene. Human ABC1 proteins and nucleotides can be used to treat or 

CC prevent cardiovascular disease, especially coronary artery disease, 

CC cerebrovascular disease, coronary restenosis or peripheral vascular 

CC disease. They may also be used in the treatment of diseases associated 

CC with ABC1 biological activity, such as Alzheimer's disease, Niemann-Pick 

CC disease, Huntington's disease, X-linked adrenoleukodystrophy and cancer. 

CC The invention specifically excludes proteins with the exact amino acid 

CC sequences of GenBank Accession No: CAA10005.1 and X75926, and the nucleic 

CC acid with the exact sequence as GenBank Accession No: AJ012376.1. The 

CC present sequence represents a mutant human ABC1 cholesterol transporter 



CC associated with an altered cholesterol level and therefore an altered 

CC risk of cardiovascular disease. Note: The present sequence is not shown 

CC in the specification, but is derived from the native human ABC1 shown on 

CC pages 152-157. 
XX 

SQ Sequence 2261 AA; 

Query Match 33.5%; Score 4244.5; DB 3; Length 2261; 

Best Local Similarity 39.9%; Pred. No. 1.2e-307; 

Matches 1001; Conservative 345; Mismatches 729; Indels 435; Gaps 61; 
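
The figures above appear to be related by simple arithmetic: Best Local Similarity equals Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). This relationship is inferred from the numbers printed in this listing, not taken from documentation of the search program, and the function name below is hypothetical. A minimal sketch of the check:

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches.

    Assumption: the alignment length is the sum of the four counts
    printed on the 'Matches' line of the report.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts printed for RESULT 10 (2510 columns in total).
print(round(best_local_similarity(1001, 345, 729, 435), 1))
```

Under the same assumption, the counts printed for RESULT 11 and RESULT 12 (999, 348, 728, 435) give the 39.8% shown in those entries.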

Qy 6 QLQLLLWKNWLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65 

I I : I I I I 1 I : I : I I I I : I I : I I I : : I I I I I : I I 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I I I : I : : I : : I I : I I :: I : 

Db 65 GTLPWVQGIICNANNPCFRYPTPGEAPGWGNFNKSIVARLFSDARRLL LYSQKDT 120 

Qy 116 SLGSELEALR— QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173 

I : : | | | : : | : : I I I I I I I I I I 

Db 121 SMKDMRKVLRTLQQIKKSSSNLKLQDFLVDNETFSG FLYHNLSLP 165 

Qy 174 NSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233 

II : I I I : I : I I III : I I : : 
Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204 

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II | | : : : | : | : I I : : I : 

Db 205 QL GDQEVSELCGLPREKLAAAE RVLRSNMDI 235 

Qy 294 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD AQKVLQDVDVLS 341 

I : : I : I I : I : I III I : I : : I I 

Db 236 LKPILRTLNSTSPFPSKELAEA--TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293 

Qy 342 ALALLLPQGACTGRTPGPPASGAGGAAN GT GAGAVMG PNAT AE E GAP S AAALAT P 396 

: : I : III : I : I I I I : II 

Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353 

Qy 397 DTLQGQCSAFVQ — LWAGLQP I LCGNNRT I EPEALRRGNMS S LGFTS KEQRNLGLLV 451 

I : : I : : : I I : I : I I 
Db 354 YCNDLMKNLESSPLSRIIWKALKPLLVG 381 

Qy 452 HLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVWLNISAEIRSFLEQGRLQQ 511 

I I I I I : I : : I I : I I : : I : I : I : I : I : 

Db 382 KILYTPDTPATRQVMAEANKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434 

Qy 512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547 

: I I I I I I I I I : I I : I : I 

Db 435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN-- 492 

Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607 

1:11 : | | | : : : : | : : | : : | : | 

Db 493 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME--LLDERKFWAG 535 

Qy 608 VI FQTRKDGS — LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG GRFYFLYGFVW 662 

::| II II II MM : |:||:|: || III I II : 



Db 536 IVFTGITPGSIELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPRADPFEDMRYVWGGFAY 595 



Qy 663 IQDMMERAI I DTFVGHDVVEPGS YVQMFPYPCYTRDDFLFVT EHMMPLCMVI SWVYSVAM 722 

: I I : : I : I I I I : : I I : I I I I I I II h I I I I : : I : I I I I : 

Db 596 LQDVVEQAIIRVLTGTE-KKTGVYMQQMPYPCWDDIFLRVMSRSMPLFMTIAWIYSVAV 654 

Qy 723 T I QHI VAEKEHRLKEVMKTMGLNNAVHWVAWFI TGFVQL S I SVT ALTAI LKYGQVLMHSH 782 

I : I I I I I I I I I I : I I I : I : : I : I I I : : I : I I I I I I : I : I 
Db 655 IIKGIVYEKEARLKETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWILKLGNLLPYSD 714 

Qy 783 WIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDK 842 

: : : : I I : I : I I I I : I I I : I I : I : I I I : I I I I I I I I I : I I : II 
Db 715 PSWFVFLSVFAWTILQCFLISTLFSRANLAAACGGIIYFTLYLPYVLC VAWQD 769 

Qy 843 I TAFE- KCI AS LMSTTAFGLGS KYFALYEVAGVGIQWHT FSQS PVEGDDFNLLLAVTMLM 901 

I I I I I : I I I I I : I I I I : I I : I : I I MINIMI : I : I : : 

Db 770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMML 829 

Qy 902 VDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961 

I : I I : : II I I M I I I I I : I II I I I I Mill Mill: Ml 

Db 830 FDTFLYGVMTWYI EAVFPGQYGI PRPWYFPCTKSYWFGE ESDEKSHPGSNQKRIS — 884 

Qy 962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021 

: I I II I I I I I I I : I I I M I M I : : M II I I M 

Db 885 EIC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929 

Qy 1022 S FLGHNGAGKTTTMS I LTGLFPPTSGSATI YGHDI RTEMDEI RKNLGMC PQHNVLFDRLT 1081 

I I I I I I I I I I I I I I I I I I II I I I I I M I I I I I M II I M I I M I I I II I I I I II 

Db 930 S FLGHNGAGKTTTMS I LTGLFP PTSGTAYI LGKDI RS EMSTI RQNLGVCPQHNVLFDMLT 989 

Qy 1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140 

II I M I I M I I I : : : : : : I M M I : I Ml I I I I I M I I I II M I I I I 

Db 990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049 

Qy 1141 GS RAI I LDEPTAGVDP YARRAIWDLI LKYKPGRTI LLSTHHMDEADLLGDRIAI I SHGKL 1200 

I M : I I I I I I I I I II M II I M M I I I : I I I M I I II I I I I I M I I II II I II II I I 
Db 1050 GSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKL 1109 

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPG GPQEPGLAS 1238 

I II I I I I I II I I I I M I : I I I 

Db 1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

Qy 1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298 

I : I I I I I M III I M M M I I I M II I II : : 
Db 1170 DHESDTLTIDVS— AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227 

Qy 1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358 

I I : I M M : I II I I : I M M II I I M II 
Db 1228 RLSDLGISSYGISETTLEEIFLKVAEE SGVDA-ETSDGTLP 1267 

Qy 1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD — NVSLQEV 1415 

I I : I : I II |:||:::: 

Db 1268 ARRNRRA-FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303 

Qy 1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474 

I : II : M I I : : I I Ml MM I I M I M M : I I I II M I : 
Db 1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363 



Qy 1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533 

: I I I I I I I I I : I I : : : | I :|:: 

Db 1364 FS LI VP P FGKYP S LELQPWMYNEQ YT FVSNDAPE DTGTLELLNAL 14 08 

Qy 1534 RLPSGVGATCVLKS PANGS LGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

III: :| |: 
Db 1409 TKDPGFGTRCM EGNPI 1424 

Qy 1594 P S PAP S D S PAS P DE DLQAWNVS L P P TAG P EMWT SAP S L P RLVRE P VR C 1641 

II I - I I I I | : | I : | : : : : I 

Db 1425 PD TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 1463 

Qy 1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF 1690 

II: II II III: I I I I I : I I I : I : I I : I : 

Db 1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523 

Qy 1691 RLHRYGAI T FG- -NVLKS I PAS FGT RAP PMVRK 17 21 

III : I I I: I : : I 

Db 1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAI KQMKKHLKLAKDSSADRFLNSLGRFMTG 1583 

Qy 1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781 

: I : I : : I I I I : I : : : : I I : I I | I I | I | | | : | | : M I I I I I : I I 
Db 1584 LDTRNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642 

Qy 1782 LS-LDYLLQGTDWIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840 

II: : I I ::: I : I I I I I I I I I I I I I I : I :: I I I I I I I : I I I : I I I I : I 

Db 1643 L S EVALMT T S VDVLVS I C VI FAMS FVPAS FWFL I Q ERVS KAKH LQ F I S GVK P VI YWL S N 1702 

Qy 1841 YVWDMLNYLVPATCCVI I LFVFDLPAYTS PTNFPAVLSLFLLYGWS I TPIMYPAS FWFEV 1900 

: M I I I I : I I I I : I I I : I I I I I : I I I I I I I I I I : I I I I I I I : : 
Db 1703 EVWDMCNYVVPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPAS FVFKI 1762 

Qy 1901 PSSAYVFLIVTNLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLG.HGLME 1960 

I I : I I I I : I I I I I I : I I I I : I : M I I : I I M I I I I I : : I I I I : : 
Db 1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 

Qy 1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQR 2020 

I I : : : : I : : : II I I : I I I I I I I I I I I I I : I :: I I I I I : 
Db 1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGVVFFLITVLIQYRFF 1880 

Qy 2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079 

: I : I I I I Nihil I I : : : I : I I I : I : : I I I I I : I : I : 
Db 1881 VNAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK RKPAVDRICVGIP 1937 

Qy 2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

I I I I I I I I I I I I I I I: I I I I I I I I I : I I : I I : I : I : I : : I I : : I I I I I I I : 
Db 1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 

Qy 2140 FDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199 

: II Ml:: : |||: |: : I : I I : I I I I I : I I I I I I I I I I I I I I I : 
Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057 

Qy 2200 ALI GYPAFI FLDEPTTGMDPKARRFLWNLI LDLI KTGRSWLTSHSMEECEALCTRLAIM 2259 

I I I I I : I I I I I I I I I I I I I I I I I I I I ::| I I I I I I I I I I I I I I I I I I I I : I I I 
Db 2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSVVKEGRSWLTSHSMEECEALCTRMAIM 2117 



Qy 2260 WGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I I I I I I I I : I I I I I I I I I I I I I I I: : I I I I || ::|||:| :| 
Db 2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177 

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

Ml I I I I : : I I : I I I I I I I I I I I I I I II I I I I I III: 
Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227 



RESULT 11 
AAB31363 

ID AAB31363 standard; protein; 2261 AA. 
XX 

AC AAB31363; 
XX 

DT 20-APR-2001 (first entry) 
XX 

DE Amino acid sequence of ABC1 polypeptide from Tangier disease patient. 
XX 

KW Human; adenosine triphosphate binding cassette protein 1; ABC1; 

KW apolipoprotein-mediated mobilisation; cholesterol; Tangier disease; 

KW chromosome 9q22-9q31; heart disease; hypercholesterolemia; 

KW atherosclerosis; cholesterol transport. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Misc-difference 587 

FT /note= "this is changed from Arg to Trp in Tangier 

FT disease" 

XX 

PN WO200078972-A2. 
XX 

PD 28-DEC-2000. 
XX 

PF 16-JUN-2000; 2000WO-US016765 . 
XX 

PR 18-JUN-1999; 99US-0140264P . 

PR 14-SEP-1999; 99US-0153872P . 

PR 19-NOV-1999; 99US-0166573P . 
XX 

PA (CVTH-) CV THERAPEUTICS INC. 
XX 

PI Lawn RM, Wade D, Garvin M; 
XX 

DR WPI; 2001-137812/14. 
XX 

PT Adenosine triphosphate (ATP) binding cassette (ABC) polynucleotide, 

PT useful for the development of agents for the treatment of heart disease 

PT and other disorders associated with hypercholesterolemia and 

PT atherosclerosis. 

XX 

PS Disclosure; Page 176-191; 215pp; English. 
XX 

CC The present sequence represents a human adenosine triphosphate (ATP) 

CC binding cassette protein (ABC) 1 polypeptide, and is isolated from a 

CC Tangier disease patient. ABC1 resides in cell membranes and utilises ATP 



CC hydrolysis to transport a wide variety of substrates across the plasma 

CC membrane. ABC1 is a pivotal protein in the apolipoprotein-mediated 

CC mobilisation of intracellular cholesterol stores. ABC1 is defective in 

CC Tangier disease, a genetic disorder characterised by abnormal HDL- 

CC cholesterol metabolism. The ABC1 gene is localised to chromosome 9q22- 

CC 9q31. The ABC1 genes and proteins are useful for developing 

CC pharmaceutical agents for the treatment of heart disease and other 

CC disorders associated with hypercholesterolemia and atherosclerosis. The 

CC genes are useful for developing screening assays to screen for compounds 

CC that regulate the expression of genes associated with cholesterol 

CC transport. The genes and proteins are also useful as 

CC diagnostic indicators of cardiovascular disease and other disorders 

CC associated with hypercholesterolemia 

XX 

SQ Sequence 2261 AA; 
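
Each record header in this listing is a flat file in which every line starts with a two-letter code (ID, AC, DT, DE, KW, OS, PN, PD, PF, PR, PA, PI, DR, PT, PS, CC, SQ) and `XX` acts as a separator. A minimal sketch of how such a header might be grouped by code is shown below; the function name `parse_record` and the choice to join continuation lines with a space are assumptions made for illustration, not part of the database format.

```python
import re
from collections import defaultdict


def parse_record(text):
    """Group header lines by their two-letter line code.

    Lines that do not start with two capitals followed by whitespace
    and content (bare 'XX' separators, 'Qy'/'Db' alignment rows,
    blank lines) are skipped.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        m = re.match(r"^([A-Z]{2})\s+(\S.*)$", line.strip())
        if m:
            fields[m.group(1)].append(m.group(2).strip())
    # Repeated codes (e.g. several KW or CC lines) are joined with a space.
    return {code: " ".join(parts) for code, parts in fields.items()}


sample = "AC AAB31363;\nXX\nOS Homo sapiens.\nXX\nPN WO200078972-A2."
print(parse_record(sample))
```

The sample string reuses values from the entry above; OCR damage inside a line (split numbers, stray punctuation) would still need to be cleaned separately.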



Query Match 33.5%; Score 4244.5; DB 4; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 1.2e-307; 

Matches 999; Conservative 348; Mismatches 728; Indels 435; Gaps 61; 



Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65 

1 1 : 1 1 1 1 1 1 : 1 : 1 1 1 1 : 1 1 : 1 1 1 : : 1 1 1 1 I : | | 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

1 1 1 : 1 : : 1 : : I I : I I : : | : 

Db 65 GTLPWVQGIICNANNPCFRYPTPGEAPGWGNFNKSIVARLFSDARRLL LYSQKDT 120 

Qy 116 SLGSELEALR--QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173 

1 : : 1 1 1 : : 1 : : 1 1 1 1 1 1 1 1 1 1 

Db 121 SMKDMRKVLRTLQQIKKSSSNLKLQDFLVDNETFSG FLYHNLSLP 165 

Qy 174 NSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233 

II : 1 1 1 : 1 : 1 1 III : II :: 

Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204 

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

I I I I ... i i . i i . . i . 

Db 205 QL GDQEVSELCGLPKEKLAAAE RVLRSNMDI 235 

Qy 294 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD AQKVLQDVDVLS 341 

1 : : 1 : 1 1 : 1 : 1 III 1 : 1 : : 1 1 

Db 236 LKPILRTLNSTSPFPSKELAEA--TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293 

Qy 342 ALALLLPQGACTGRTPGPPASGAGGAAN GT GAGAVMG PNAT AEE GAP S AAALAT P 396 

: : 1 : III : 1 : 1 1 1 1 : 1 1 

Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353 

Qy 397 DTLQGQCSAFVQ--LWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLV 451 

1 : : 1 : : : 1 |:|:| 1 

Db 354 YCNDLMKNLESSPLSRIIWKALKPLLVG 381 

Qy 452 HLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVWLNISAEIRSFLEQGRLQQ 511 

Mill : 1 : : 1 : 1 1 : : 1 : 1 : 1 : 1 : 1 : 

Db 382 ---KILYTPDTPATRQVMAEWKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434 

Qy 512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547 

: I I I I I I I I I : I I : I : I 

Db 435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN-- 492 

548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIWYTLNQAYQDNVTVFAS 607 

I : I I : | | | : : : : | : : | : : | : | 

4 93 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME — LLDERKFWAG 535 

608 VI FQTRKDGS — LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG GRFYFLYGFVW 662 

: : I II I I I I I I I I : I : I I : I : I I I I I : I I I : 

536 IVFTGITPGSIELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPRADPFEDMWYVWGGFAY 595 

663 IQDMMERAI I DTFVGHDWEPGS YVQMFPYPCYTRDDFLFVI EHMMPLCMVI SWVYSVAM 722 

: I I : : I : I I I I : : I I : I I I I I I I I I I : I I I I : : I : I I I I : 

596 LQ DWEQAI I RVLT GT E- K KT GVYMQQM P Y P C YVDD I FL RVMS R SM P L FMT LAW I Y S VAV 654 

723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSH 782 

1 : I I I I I I I I I I : I I I : I : : I : I I I : : | : | I I I I I : I : I 
655 IIKGIVYEKEARLKETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWILKLGNLLPYSD 714 

783 WI I WL FLAVYAVAT I M FC FLVS VL Y S KAKLAS AC GGIIYFLS YVP YM YVAI RE E VAH D K 842 

: : : : II : I : I I I I : I I I : I I : I : I I I : I I I I I I I I I : I I : II 
715 PSWFVFLSVFAVVTILQCFLISTLFSRANLAAACGGIIYFTLYLPYVLC VAWQD 769 

843 ITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLM 901 

I I I I I : I III I : I I I I : I I :'| : I I : I I I I I I I I : : : I : : 

770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSISMML 82 9 

902 VDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961 

I : I I :: I I I I I I I I I I I : I I I I I I I I I I I I hill: I : I 

830 FDTFLYGVMTWYI EAVFPGQYGI PRPWYFPCTKSYWFGE ESDEKSHPGSNQKRMS — 884 

962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021 

: I I I I I I I I I I I : lllhl I : I : : I : I I I I I : 

885 — EIC — MEEE'PTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 92 9 

1022 S FLGHNGAGKTTTMS I LTGLFP PT SGSATI YGHDI RTEMDEI RKNLGMCPQHNVLFDRLT 1081 
I I II I II I I I I I I I I I I I I I I I I I I I : I I I |||:|| 11:111:111111111 II 
930 S FLGHNGAGKTTTMS I LTGLFPPTSGTAYILGKDIRSEMSTIRQNLGVCPQHNVLFDMLT 989 

1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 114 0 
I I M : I I I : I I I : : : : : : I I : : I I : I I : I I I I I I I : I I I I I I : I I I I 

990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049 

1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200 

II: : I I I I I I II I I I I : I I I I : I : I I I : I I I I : I I I I I I I I I I : I I I I I I I I I I I I I 
1050 GSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKL 1109 

1201 KCCGS PLFLKGT YGDGYRLTLVKRPAEPG GPQEPGLAS 1238 

I I I I I II I I I I I I I I : I : I I I 

1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYI LPSEAAKKGAFERLFQHLER 1298 

I : I Mill: III I I : I : II I I I I : I I I II : : 
1170 DHESDTLTIDVS — AI SNLI RKHVSEARLVEDI GHELT YVLPYEAAKEGAFVELFHEI DD 1227 



1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358 
I I : I I : I : : I I I I I : I I I I : II I I I : II 



Db 1228 RLSDLGISSYGISETTLEEIFLKVAEE SGVDA-ETSDGTLP 1267 



Qy 1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD--NVSLQEV 1415 

II: I : I II I : M : : : : 

Db 1268 ARRNRRA-FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303 

Qy 1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474 

I : I I : I : I I : : I I : I I I I I I I I I : I I : I I ': I I I I I I = I ' 

Db 1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363 

Qy 1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533 

:| II I I I I I I: II : : : I I : I : : 

Db 1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE --DTGTLELLNAL 1408 

Qy 1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

I I I: . : I I : 

Db 1409 TKDPGFGTRCM EGNPI 1424 

Qy 1594 P S PAP S D S PAS P DEDLQAWNVS L P P TAG P EMWT SAP S L P RLVRE PVR C 1641 

II | | | | | | : | | : | : : : : I 

Db 1425 PD TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 1463 

Qy 1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF 1690 

II: II II III: I I I I I : II I : I : I I : I : 

Db 1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523 

Qy 1691 RLHRYGAITFG--NVLKSIPASFGTRAPPMVRK 1721 

I I I I I I : 

Db 1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAIKQMKKHLKLAKDSSADRFLNSLGRFMTG 1583 

Qy 1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781 

: I : I : : I I I I : I :: : : I I : I I I I I I I I I I • I I : MM II I : I I 

Db 1584 LDTRNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642 

Qy 1782 LS-LDYLLQGTDWIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840 

II: : II : : : I : I II II I I I I II M I : I : : I I II I I I : I I I M I I I : I 

Db 1643 LSEVALMTTSVDVLVSICVIFAMSFVPASFWFLIQERVSKAKHLQFISGVKPVIYWLSN 1702 

Qy 1841 YVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEV 1900 

: || II II : I I II : I I I : I I I I I : I II I I II 1.1 I M I I I I I I : : 

Db 1703 FVWDMCNYWPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKI 1762 

Qy 1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLME 1960 

11:1111 : I II II I : II I I : I : I I I I : I I I I I I I I I : : M I I : : 

Db 1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 

Qy 1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQR 2020 

I I I : II I I : I I I I II II I II I I : I :: II I II 

Db 1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880 

Qy 2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079 

Mllll IMIMI I I I : I II : I : I I I I I : I : I 

Db 1881 VNAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK RKPAVDRICVGIP 1937 

Qy 2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

I II I I I II II II II I : II II II I I I : I I : II : I MM : M | : : I M I I I I ' 

Db 1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 



Qy 2140 FDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199 

: I I I I I :: : I I I : I : : I : I I. : I I I I I : I I I I I I I I I I I I I I I : 
Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057 

Qy 2200 ALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIM 2259 

I I I I I : I I I I I I I I I I I II I I I I I I I :: I I I I I I I I II I I I I I I I I I I I : I 1 I 

Db 2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSVVKEGRSWLTSHSMEECEALCTRMAIM 2117 

Qy 2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I I I I I I I I : I I I I I I I I I I I I I I I : : I I II I I : : I I I : I : I 

Db 2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177 

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

I I I I I I I : : I I : I I I I I I I I I I I I I I I I I I I I I I I I ? 

Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227 



RESULT 12 
AAB31367 

ID AAB31367 standard; protein; 2261 AA. 
XX 

AC AAB31367; 
XX 

DT 20-APR-2001 (first entry) 
XX 

DE Amino acid sequence of ABC1 polypeptide from Tangier disease patient. 
XX 

KW Human; adenosine triphosphate binding cassette protein 1; ABC1; 

KW apolipoprotein-mediated mobilisation; cholesterol; Tangier disease; 

KW chromosome 9q22-9q31; heart disease; hypercholesterolemia; 

KW atherosclerosis; cholesterol transport. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 
FT Misc-difference 587 

FT /note= "this is changed from Arg to Trp in Tangier 

FT disease" 

XX 

PN WO200078971-A2. 
XX 

PD 28-DEC-2000. 
XX 

PF 16-JUN-2000; 2000WO-US016591 . 
XX 

PR 18-JUN-1999; 99US-0140264P . 
PR 14-SEP-1999; 99US-0153872P . 
PR 19-NOV-1999; 99US-0166573P . 
XX 

PA (CVTH-) CV THERAPEUTICS INC. 
PA (UNIW-) UNIV WASHINGTON. 
XX 

PI Lawn RM, Wade D, Oram JF, Garvin M; 
XX 

DR WPI; 2001-137811/14. 
DR N-PSDB; AAF24708. 



PT Adenosine triphosphate (ATP) binding cassette protein (ABC) 1 

PT polynucleotides and polypeptides, useful for treatment of heart disease 

PT and other disorders associated with hypercholesterolemia and 

PT atherosclerosis. 

XX 

PS Claim 28; Page 172-187; 211pp; English. 
XX 

CC The present sequence represents a human adenosine triphosphate (ATP) 

CC binding cassette protein (ABC) 1 polypeptide, and is isolated from a 

CC Tangier disease patient. ABC1 resides in cell membranes and utilises ATP 

CC hydrolysis to transport a wide variety of substrates across the plasma 

CC membrane. ABC1 is a pivotal protein in the apolipoprotein-mediated 

CC mobilisation of intracellular cholesterol stores. ABC1 is defective in 

CC Tangier disease, a genetic disorder characterised by abnormal HDL- 

CC cholesterol metabolism. The ABC1 gene is localised to chromosome 9q22- 

CC 9q31. The ABC1 genes and proteins are useful for developing 

CC pharmaceutical agents for the treatment of heart disease and other 

CC disorders associated with hypercholesterolemia and atherosclerosis. The 

CC genes are useful for developing screening assays to screen for compounds 

CC that regulate the expression of genes associated with cholesterol 

CC transport. The genes and proteins are also useful as 

CC diagnostic indicators of cardiovascular disease and other disorders 

CC associated with hypercholesterolemia 

XX 

SQ Sequence 2261 AA; 

Query Match 33.5%; Score 4244.5; DB 4; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 1.2e-307; 

Matches 999; Conservative 348; Mismatches 728; Indels 435; Gaps 61; 

Qy 6 QLQLLLWKNVTLKRRS PWVLAFEI FI PLVLFFI LLGLRQKKPT I SVKEVPFYTAAPLTSA 65 

I I : I I I I I I : I : I I I I : I I : I I I : : I I I I I : I I 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I II : I : : I : : I I : I I : : I : 
Db 65 GTLPWVQGIICNANNPCFRYPTPGEAPGWGNFNKSIVARLFSDARRLL LYSQKDT 120 

Qy 116 SLGSELEALR--QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173 

I : : | | I :: | : : I I I I I I I I I I 

Db 121 SMKDMRKVLRTLQQI KKS S SNLKLQDFLVDNET FS G FLYHNLSLP 165 

Qy 174 NSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233 

II MM : I : I I III : ||:: 
Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204 

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II | | : : : | : : | : I I : : I : 

Db 205 QL GDQEVSELCGLPKEKLAAAE RVLRSNMDI 235 

Qy 294 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD AQKVLQDVDVLS 341 

I : : I : I I : I : I III I : I : : I I 

Db 236 LKPILRTLNSTSPFPSKELAEA — TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293 

Qy 342 ALALLLPQGACTGRTPGPPASGAGGAAN GT GAGAVMG PN AT AE E GAP S AAALAT P 396 

: : I : III : I : I I I I : | | 



Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353 



Qy 397 DTLQGQCSAFVQ— LWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLV 451 

I : : I : : : I I : I : I I 
Db 354 YCNDLMKNLESSPLSRIIWKALKPLLVG 381 

Qy 452 HLMT SN P KI L YAP AG S EVD RVI LKANET FAFVGNVTH YAQVWLN I S AE I RS FLEQGRLQQ 511 

I I I I I : | : : | : | | : : | : | : | : | : | : 

Db 382 KILYTPDTPATRQVMAEVNKTFQEIAVFHDLEGMWEELSPKIWTFMENSQEMD 434 

Qy 512 HLRWL — QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547 

: I I I I I I I I I : I I : I : I 

Db 435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN — 492 

Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIWYTLNQAYQDNVTVFAS 607 

I : -I I : | | | : : : : | : : | : : | : | 

Db 4 93 QAIRTIS — RFMECVNLNKLEPIATEVWLINKSME— LLDERKFWAG 535 

Qy 608 VI FQTRKDGS- -LPPHVHYKI RQNS S FTEKTNEI RRAYWRPGPNTG GRFYFLYGFVW 662 

:: I II I I I I I I I I : I : I I : I : I I I I I : I I I : 

Db 536 I VFTGITPGS I ELPHHVKYKI RMDI DNVERTNKI KDGYWDPGPRADPFEDMWYVWGGFAY 595 

Qy 663 IQDMMERAI I DTFVGHDWEPGS YVQMFPYPCYTRDDFLFVI EHMMPLCMVI SWVYSVAM 722 

: I I : : I : I I I I : : I I : I I I I I I Mil: I I I I : : I : I I I I : 

Db 596 LQDWEQAIIRVLTGTE-KKTGVYMQQMPYPCYVDDIFLRVMSRSMPLFMTLAWIYSVAV 654 

Qy 723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWAWFITGFVQLSISVTALTAILKYGQVLMHSH 782 

I : I I I I I I I I I I : I I I : I : : I : I I I : : I : I I I I I I : I : I 
Db 655 IIKGIVYEKEARLKETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWILKLGNLLPYSD 714 

Qy 783 WIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDK 842 

: : : : I I : I : I I I I : I I I : I I : I : I I I : I I I I I I I I I : I I : II 
Db 715 PSWFVFLSVFAWTILQCFLISTLFSRANLAAACGGIIYFTLYLPYVLC VAWQD 769 

Qy 843 ITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLM 901 

I I I I I : I I I I I : I I I I : I I : I : I I Mill I III : : : I : : 
Db 770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSISMML 829 

Qy 902 VDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961 

I : I I :: I I I I I I I I I I I : I I I I I I I I I I I I I .: I I I : hi 

Db 830 FDTFLYGVMTWYIEAVFPGQYGIPRPWYFPCTKSYWFGE ESDEKSHPGSNQKRMS-- 884 

Qy 962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021 

: I I I I I I I II I I : I I I I : I I : I : : I : I I I I I : 

Db 885 EIC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929 

Qy 1022 S FLGHNGAGKTTTMS I LTGL FPPT S GSATI YGHDI RTEMDEI RKNLGMCPQHNVLFDRLT 1081 

I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I llhll I I : I I I : I I I I I I I I I II 
Db 930 S FLGHNGAGKTTTMS I LTGL FP PT S GTAYI LGKDI RS EMST I RQNLGVCPQHNVLFDMLT 989 

Qy 1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140 

lllhllhlll :::: :: l|::| I : I I - I I I I I I I : I I I I I I : I I I I 
Db 990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 104 9 

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200 

II: : I I I I I I I I I I I I : I I I I : I : I I I : I I I I : I I I I I I I I I I : I I I I I I I I I I I II 
Db 1050 GSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKL 1109 



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 
Qy 

Db 



1201 KCCGS PLFLKGT YGDGYRLTLVKRPAEPG GPQEPGLAS 1238 

I I I I I I I 11111111:1 : I I I 

1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298 

I : I Mill: III I I : I : I I I I I I : I I I II : : 

1170 DHESDTLTIDVS — AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227 

1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358 

I I : I I : I : : I I I I I : I I II : I I I I I : II 

1228 RLSDLGI S S YGI SETTLEEI FLKVAEE SGVDA-ETSDGTLP 1267 

1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD — NVSLQEV 1415 

II: I : I II I : I I : : : : 

12 68 ARRNRRA- FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303 

1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474 

I : I I : I : I I : : I I : I I I I I I I I I : I I : I I : I I I I I I : I : 

1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363 

1475 VALS VPEI GDLPPLVLS PSQYH-NYTQPRGNFI P YANEERREYRLRLS PDAS PQQLVSTF 1533 

: I I I I I I I I I : I I : : : I I : I : : 

1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE -DTGTLELLNAL 1408 

1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

III: : I I : 
1409 TKDPGFGTRCM EGNPI 1424 

1594 P S PAP S D S PAS P D ED LQ AWN VS L P P TAG P EMWT SAP S L P RL VRE PVR C 1641 

II I I I I I I : I I : I : : : : I 
1425 PD T P CQAG E E EWT TAP - VP QT I MD L FQN GNWTMQN P S P AC 1463 

1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF 1690 

II: II II III: I I I I I : I I I : I : I I : I : 

1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523 

1691 RLHRYGAI T FG — NVLKS I PAS FGTRAP PMVRK 1721 

111:11 I : I :: I 

1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAI KQMKKHLKLAKDSSADRFLNSLGRFMTG 1583 

1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781 

: I : | : : | | | | : | : : : : | | : | | | M I I I I I : I I : I I I I I I I : I I 
1584 LDTRNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642 

1782 LS-LDYLLQGTDVVIAIFIIVAMSFVPAS FWFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840 
II: : I I ::: I : I I I I I I I I I I I I I I : I :: I I I I I I I : I I I : I I II : I 

1643 L S EVALMT T S VD VL VS I C VI FAMS FVP AS FWFL I Q E RVS KAKH LQ FI S GVK P VI YW L S N 1702 

1841 YVWDMLN YLVPAT CC VT I L FVFDL PAYT S PTN FPAVL S L FLL YGWS I T P I MY PAS FWFEV 1900 

: I I I I I I : I I I I : I I I : I I I I I : I I I I I I I I I I : I I I I I I I : : 
1703 FVWDMCNYWPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKI 1762 

1901 PSSAWFLIVINLFIGITATVATFLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLME 1960 

11:1111 : I I I I I I : I I I I : I : I I I I : I I I I I I I II : : I I I I : : 
1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 



Qy 1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQR 2020 

I I : : : : I : :: II I I : I I I I I I I I I I I I I : I :: I I I II: 
Db 1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880

Qy 2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079 

: I : I I I I I I I I : I I I I : : : I : I I I : I : : I I I I I : I : I : 
Db 1881 WAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK RKPAVDRICVGIP 1937 

Qy 2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

I I I I I I I I I I I I I I I : I I I I I I I I I : I I : I I : I : I : I : : I I : : I I I I I I I : 
Db 1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 

Qy 2140 FDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199 

: II III:: : III: I : : I : I I : II I II : I II I I I I I I I I I I I I : 
Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAI RKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057 

Qy 2200 ALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIM 2259 

I I I I I : I I I I I I I I I I I I I I I I I I I I : : I I I I I I I I I I I I I I I I I I I I I : I I I 
Db 2058 ALIGGPPVVFLDEPTTGMDPKARRFLWNCALSVVKEGRSVVLTSHSMEECEALCTRMAIM 2117

Qy 2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I I I I I I I I : I I I I I I I I I I I I I I I : : I I II I I : : I I I : I : I 
Db 2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177 

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

III I I I I : : I I : I I I I I I I I I I I I I I I I I I I I I III: 

Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227



RESULT 13 
AAB38109 

ID AAB38109 standard; protein; 2261 AA. 
XX 

AC AAB38109; 
XX 

DT 29-JAN-2001 (first entry) 
XX 

DE Human ABC1 cholesterol transporter mutant, R219K. 
XX 

KW Human ABC1 cholesterol transporter; chromosome 9q31; 

KW ATP-binding cassette; HDL deficiency disorder; high density lipoprotein; 

KW Tangier disease; TD; familial HDL deficiency; FHA; polymorphism; 

KW cardiovascular disease; coronary artery disease; coronary restenosis; 

KW cerebrovascular disease; peripheral vascular disease; 

KW Alzheimer's disease; Niemann-Pick disease; Huntington's disease; 

KW X-linked adrenoleukodystrophy ; cancer; gene therapy; genetic diagnosis; 

KW prognosis; prophylaxis; drug screening; transgenic animal; mutant; 

KW mutein.

XX 

OS Homo sapiens. 
XX 

PN WO200055318-A2. 
XX 

PD 21-SEP-2000. 
XX 

PF 15-MAR-2000; 2000WO-IB000532 . 
XX 



PR 15-MAR-1999; 99US-0124702P . 

PR 08-JUN-1999; 99US-0138048P . 

PR 17-JUN-1999; 99US-0139600P .

PR 01-SEP-1999; 99US-0151977P . 
XX 

PA (UYBR-) UNIV BRITISH COLUMBIA. 

PA (XENO-) XENON BIORESEARCH INC. 

XX

PI Hayden MR, Wilson AR, Pimstone SN; 

XX 

DR WPI; 2000-587528/55. 
XX 

PT New ABC1 polypeptide is useful for treating diseases associated with ABC1 

PT biological activity, e.g. Alzheimer's disease, Huntington's disease and 

PT cancer. 
XX 

PS Example; Page; 229pp; English. 
XX 

CC The invention relates to the human ABC1 cholesterol transporter protein 

CC (B38082) and to nucleic acid sequences (C69120) which encode it. ABC1 is 

CC a member of the ATP-binding cassette (ABC transporter) superfamily of 

CC proteins, and plays a crucial role in cholesterol transport, particularly 

CC intracellular cholesterol trafficking in monocytes and fibroblasts, being 

CC involved in cholesterol efflux from the cell. The gene encoding ABC1 is 

CC located on chromosome 9q31, and mutations in this gene are associated 

CC with two genetic HDL (high density lipoprotein) deficiency disorders, 

CC Tangier disease (TD) and familial HDL deficiency (FHA). These diseases

CC are distinguishable in that TD is an autosomal recessive disorder, while 

CC FHA is inherited as an autosomal dominant trait. Low levels of HDL ("good 

CC cholesterol") in the blood correlate with a high risk of cardiovascular 

CC disease, particularly coronary artery disease, but also cerebrovascular 

CC disease, coronary restenosis, and peripheral vascular disease. 

CC Conversely, a high level of HDL has protective effects against 

CC cardiovascular disease. The invention provides genetic constructs and 

CC transgenic cells and non-human animals comprising human ABC1 nucleic 

CC acids, and methods of gene therapy for the treatment or prevention of 

CC cardiovascular disease comprising the administration of an expression 

CC vector encoding ABC1 or an active fragment thereof. The invention also 

CC encompasses compounds which mimic ABC1 activity, compounds which 

CC stimulate ABC1 expression and methods of screening for such compounds. It 

CC further relates to methods for determining whether a patient has an 

CC increased risk for cardiovascular disease due to polymorphisms in the 

CC ABC1 gene. Human ABC1 proteins and nucleotides can be used to treat or 

CC prevent cardiovascular disease, especially coronary artery disease, 

CC cerebrovascular disease, coronary restenosis or peripheral vascular 

CC disease. They may also be used in the treatment of diseases associated 

CC with ABC1 biological activity, such as Alzheimer's disease, Niemann-Pick 

CC disease, Huntington's disease, X-linked adrenoleukodystrophy and cancer. 

CC The invention specifically excludes proteins with the exact amino acid 

CC sequences of GenBank Accession No: CAA10005.1 and X75926, and the nucleic

CC acid with the exact sequence as GenBank Accession No: AJ012376.1. The 

CC present sequence represents a mutant human ABC1 cholesterol transporter 

CC associated with an altered cholesterol level and therefore an altered 

CC risk of cardiovascular disease. Note: The present sequence is not shown 

CC in the specification, but is derived from the native human ABC1 shown on 

CC pages 152-157 



SQ Sequence 2261 AA; 



Query Match 33.5%; Score 4241.5; DB 3; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 2.1e-307; 

Matches 1000; Conservative 346; Mismatches 729; Indels 435; Gaps 61; 

Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65 

I I : I I I I I I : I : I I I I : I I : I I I : : I I I I I : I I 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I I I : I : : I : : I I :| I : : | : 
Db 65 GTLPWVQGI ICNANNPCFRYPTPGEAPGVVGNFNKSIVARLFSDARRLL LYSQKDT 120 

Qy 116 SLGSELEALR--QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173

I : : I I I : : I : : ! I I II I I I I I 

Db 121 SMKDMRKVLRTLQQI KKS S SNLKLQDFLVDNET FSG FLYHNLSLP 165 



Qy 174 NSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233 

II : I I I : I : I I III : | | : : 
Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204 

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II I | : : : | : : | : I I : : I : 

Db 205 — QL GDQEVSELCGLPKEKLAAAE RVLRSNMDI 235 

Qy 294 AK- VSQQLGLDAPNGS DS S PQAP P PRRLQALLGDLLD AQKVLQDVDVLS 341 

I : : I : I I : I : I III I : I : : I I 

Db 236 LKP I LRTLNSTS PFPSKELAEA — TKTLLHSLGTLAQELFSMRSWS DMRQEVMFLTNVNS 293 

Qy 342 ALALLL PQGACTGRT PGP PAS GAGGAAN GTGAGAVMGPNATAEEGAPSAAALATP 396 

= = I : III : I : I I I I : II 

Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353 

Qy 397 DTLQGQCSAFVQ— LWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLV 451 

I I: :. :| hhl I 
Db 354 YCNDLMKNLESSPLSRIIWKALKPLLVG 381 

Qy 452 HLMTSNPKI LYAPAGSEVDRVI LKANETFAFVGN VTHYAQVWLNI SAEI RS FLEQGRLQQ 511 

I I I I I : I : : I : I I : : I : I : I : I : I : 

Db 382 KILYTPDTPATRQVMAEVNKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434 

Qy 512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547 

Ml I I I I I I I : I I : I : I 

Db 435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFIAKHPEDVQSSNGSVYTWREAFNETN — 492 

Qy 548 LPS GMALLQQLDT I DNAACGWI QFMS KVS VDI FKGFPDEES I VN YTLNQAYQDNVTVFAS 607 

I : I I : I I | : I : : | : : | : | 

Db 493 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME-- LLDERKFWAG 535

Qy 608 VI FQTRKDGS — LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG GRFYFLYGFVW 662 

::| M II II MM : |:||:|: II III I II : 

Db 536 I VFTGITPGS I ELPHHVKYKI RMDIDNVERTNKI KDGYWDPGPRADPFEDMRYVWGGFAY 595 

Qy 663 I QDMME RAI IDT FVGH D WE P G S YVQMF P Y P C YT RDD FL FVI EHMMP L CMVI S WVY S VAM 722 

: M : : I : M I I : : I I : I I I I I I Mil: III I : : I : I I I I : 
Db 596 LQDVVEQAIIRVLTGTE-KKTGVYMQQMPYPCYVDDIFLRVMSRSMPLFMTLAWIYSVAV 654 



723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSH 782

I: II III : I I I : I : : I : I I I : : I :| | III I :| :| 

655 IIKGIVYEKEARLKETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWILKLGNLLPYSD 714

783 WI I WL FLAVYAVAT I M FC FL VS VL Y S KAKLAS AC GG I I YFL S YVP YM YVAI RE EVAH D K 842

: : : : I I : I : I I I I : I I I : I I : I : I I I : I I I I I I I I I : I I : II 
715 PSWFVFLSVFAWTILQCFLISTLFSRANLAAACGGIIYFTLYLPYVLC VAWQD 769 

843 ITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLM 901

I I 111:1 I I I I : I I I I : I I : I : I I : I I I I I I I I : I : I : : 
770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMML 829

902 VDAVVYGILTWYIEAWPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961 

I : I I I I I I I I I I I I I : I I I I I I I I I I I I I : I I I : I : I 

830 FDTFLYGVMTWYIEAVFPGQYGIPRPWYFPCTKSYWFGE ESDEKSHPGSNQKRIS-- 884

962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021 

: I I I I I I I I I I I : lllhl I : I : : I : I I II I : 

885 EIC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929 

1022 SFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLT 1081 

I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I l||:|| I I : I I I : I I I | 

930 SFLGHNGAGKTTTMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLGVCPQHNVLFDMLT 989

1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140
M I I : I I I : I I I :::::: I I :: I I : I I : I | I I I I I : I I I I I I : I I I I 

990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049

1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

: I I I I II I I I II I : I I I I : I : I I | : I I I I : II I I I I I I I I : I I I I I I I I I II I I 
1050 GS KWI LDEPTAGVDP YS RRGIWELLLKYRQGRT I I LSTHHMDEADVLGDRIAI I SHGKL 1109

1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPG GPQEPGLAS 1238 

I I I I I I I I I I I I I I I : I : | | | 

1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298 

I : I Mill: III I I : I : I I I I I I : I I I II : : 
1170 DHESDTLTIDVS — AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227 

1299 S LDALHLS S FGLMDTTLEEVFLKVS EEDQS LENS EADVKES RKDVLPGAEGPAS GEGHAG 1358 

I I : I I : I : : I I I I I : I I I I : I I I I I : I I 
1228 RLSDLGISSYGISETTLEEIFLKVAEE SGVDA-ETSDGTLP- 1267 

1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD — NVSLQEV 1415 

II : I : I II |: ||: :: : 

1268 ARRNRRA -FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303

1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474 

I : M : I : I I : : I | : | | | | | | I I I : | I : I I : I I I I I I : I : 
1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363 

1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533

: I I I I I I I I I : I I : : : I I : I : : 

1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE — DTGTLELLNAL 1408 



1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

Ml: :| |: 
1409 TKDPGFGTRCM EGNPT 1424 

1594 P S PAP S D S PAS P DEDLQAWNVS L P PTAG P EMWT SAP S L P RLVRE PVR C 1641 



1425 -- PD TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 1463

1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF 1690 

II: II II III: I I I I I : I I I : I : I I : I : 

1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523

1691 RLH RYGAI T FG — NVLKSI PAS FGT RAP PMVRK 1721 

I I I : I I I : I :: I 

1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAI KQMKKHLKLAKDSSADRFLNSLGRFMTG 1583 

1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781

: I : I : : I I I I : I : : : : I I : I I I I I I I I I I : I I : I I I I I I I : I I 
1584 LDTRNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642

1782 LS-LDYLLQGTDWIAIFIIVAMSFVPAS FWFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840 
M : : I : : : I : I I I I I ! I I I I I I I I : I : : I I I I I I I : I I I : I I I I : I 

1643 L S E VALMT T S VD VL VS I C VI FAMS FVP AS FWFL I Q E RVS KAKH LQ F I S G VK P VI YW L S N 1702 

1841 YVWDMLN YLVPATCCVI I LFVFDLPAYTS PTNFPAVLSLFLLYGWS ITPIMYPAS FWFEV 1900

: I I I I I I : I I I I : I I I MINI: I I I I I I I I M : I I I I I I I : : 
1703 FVWDMCNYWPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKI 1762

1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLME 1960 

11:1111 : M II I I : I I I I : I : I I I I : I I I I. I I I I I : : I I I I : : 
1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 

1961 MAYN E Y I N E Y YAK I GQ FD KMK S P FEWD I VT RGLVAMAVEG WG FL LT I MC Q YN FL RRP Q R 2020 

I I : : : : I : : : II I I : I I I I I I I I I II I I : I :: I I I I I : 
1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880 

2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079

: I: I I II Nihil ll:::|: |||:|: : I ||||:|:|: 
1881 VNAKLS PLNDEDEDVRRERQRI LDGGGQNDI LEI KELTKI YRRK RKPAVDRICVGIP 1937

2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

I I I I I I I I I I I I I I I : I I I I I I I I I : I I : M : I : I : I : :| |::||||| ||: 
1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 

2140 FDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199 

-Mill::: III: I : : I : I I : I I I I I : I I I I I I I I I I I I I I I : 
1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057 

2200 ALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIM 2259

I I I I I : I I I I I I I I I M I I I I I I I I I ::| I I I I I I I I I I I I I I I I I I I I : I I I 
2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSVVKEGRSVVLTSHSMEECEALCTRMAIM 2117 

2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I I I I I I I I : I I I I I I I I I I I I I I I : :|| || I I :: I I I : I : I 
2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177 



2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 



Ml I I I I : : I I : I I I II I I I I I I I I I I I I I I I I III:
Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227



RESULT 14 
AAB38117 

ID AAB38117 standard; protein; 2261 AA. 
XX 

AC AAB38117; 
XX 

DT 29-JAN-2001 (first entry) 
XX 

DE Human ABC1 cholesterol transporter mutant, I883M. 
XX 

KW Human ABC1 cholesterol transporter; chromosome 9q31; 

KW ATP-binding cassette; HDL deficiency disorder; high density lipoprotein; 

KW Tangier disease; TD; familial HDL deficiency; FHA; polymorphism; 

KW cardiovascular disease; coronary artery disease; coronary restenosis; 

KW cerebrovascular disease; peripheral vascular disease; 

KW Alzheimer's disease; Niemann-Pick disease; Huntington's disease; 

KW X-linked adrenoleukodystrophy ; cancer; gene therapy; genetic diagnosis; 

KW prognosis; prophylaxis; drug screening; transgenic animal; mutant; 

KW mutein. 

XX 

OS Homo sapiens. 
XX 

PN WO200055318-A2. 
XX 

PD 21-SEP-2000. 
XX 

PF 15-MAR-2000; 2000WO-IB000532 . 
XX 

PR 15-MAR-1999; 99US-0124702P . 

PR 08-JUN-1999; 99US-0138048P . 

PR 17-JUN-1999; 99US-0139600P . 

PR 01-SEP-1999; 99US-0151977P . 
XX 

PA (UYBR-) UNIV BRITISH COLUMBIA. 

PA (XENO-) XENON BIORESEARCH INC. 
XX 

PI Hayden MR, Wilson AR, Pimstone SN; 
XX 

DR WPI; 2000-587528/55. 
XX 

PT New ABC1 polypeptide is useful for treating diseases associated with ABC1 

PT biological activity, e.g. Alzheimer's disease, Huntington's disease and 

PT cancer. 
XX 

PS Example; Page; 229pp; English. 
XX 

CC The invention relates to the human ABC1 cholesterol transporter protein 

CC (B38082) and to nucleic acid sequences (C69120) which encode it. ABC1 is 

CC a member of the ATP-binding cassette (ABC transporter) superfamily of 

CC proteins, and plays a crucial role in cholesterol transport, particularly 

CC intracellular cholesterol trafficking in monocytes and fibroblasts, being 

CC involved in cholesterol efflux from the cell. The gene encoding ABC1 is 

CC located on chromosome 9q31, and mutations in this gene are associated 



CC with two genetic HDL (high density lipoprotein) deficiency disorders, 

CC Tangier disease (TD) and familial HDL deficiency (FHA). These diseases 

CC are distinguishable in that TD is an autosomal recessive disorder, while 

CC FHA is inherited as an autosomal dominant trait. Low levels of HDL ("good 

CC cholesterol") in the blood correlate with a high risk of cardiovascular 

CC disease, particularly coronary artery disease, but also cerebrovascular 

CC disease, coronary restenosis, and peripheral vascular disease. 

CC Conversely, a high level of HDL has protective effects against 

CC cardiovascular disease. The invention provides genetic constructs and 

CC transgenic cells and non-human animals comprising human ABC1 nucleic 

CC acids, and methods of gene therapy for the treatment or prevention of 

CC cardiovascular disease comprising the administration of an expression 

CC vector encoding ABC1 or an active fragment thereof. The invention also 

CC encompasses compounds which mimic ABC1 activity, compounds which 

CC stimulate ABC1 expression and methods of screening for such compounds. It 

CC further relates to methods for determining whether a patient has an 

CC increased risk for cardiovascular disease due to polymorphisms in the 

CC ABC1 gene. Human ABC1 proteins and nucleotides can be used to treat or 

CC prevent cardiovascular disease, especially coronary artery disease, 

CC cerebrovascular disease, coronary restenosis or peripheral vascular 

CC disease. They may also be used in the treatment of diseases associated 

CC with ABC1 biological activity, such as Alzheimer's disease, Niemann-Pick

CC disease, Huntington's disease, X-linked adrenoleukodystrophy and cancer. 

CC The invention specifically excludes proteins with the exact amino acid 

CC sequences of GenBank Accession No: CAA10005.1 and X75926, and the nucleic 

CC acid with the exact sequence as GenBank Accession No: AJ012376.1. The 

CC present sequence represents a mutant human ABC1 cholesterol transporter 

CC associated with an altered cholesterol level and therefore an altered 

CC risk of cardiovascular disease. Note: The present sequence is not shown 

CC in the specification, but is derived from the native human ABC1 shown on 

CC pages 152-157 
XX 

SQ Sequence 2261 AA; 

Query Match 33.5%; Score 4240.5; DB 3; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 2.5e-307; 

Matches 1000; Conservative 345; Mismatches 730; Indels 435; Gaps 61; 

Qy 6 QLQLLLWKN VTLKRRS PWVLAFEI FI PLVLFFI LLGLRQKKPT I SVKEVP FYTAAPLT SA 65 

I I : I I I I I I : I : I I I I : I I : I I I : : I I I I I : I I 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 



Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I I I : I : : I : : I I : I I : : I : 

Db 65 GTLPWVQGIICNANNPCFRYPTPGEAPGVVGNFNKSIVARLFSDARRLL LYSQKDT 120

Qy 116 SLGSELEALR--QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173

I : : | | | : : | : : I I I I I I I I I I 

Db 121 SMKDMRKVLRTLQQI KKS S SNLKLQDFLVDNET FS G FLYHNLSLP 165 

Qy 174 NSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233

II : I I I : I : I I III : I I : : 

Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II I | : : : | : I : I I : : I : 

Db 205 QL GDQEVSELCGLPREKLAAAE RVLRSNMDI 235 



Qy 294 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD AQKVLQDVDVLS 341 

I : : I : I I : I : I III I : I : : I I 

Db 236 LKPILRTLNSTSPFPSKELAEA -- TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293

Qy 342 ALALLLPQGACTGRTPGPPASGAGGAAN GTGAGAVMGPNAT AEEGAP S AAALAT P 396 

: : I : III : I : I I I I : I I 

Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353

Qy 397 DTLQGQCSAFVQ — LWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLV 451 

I : : I : : : I hhl I 

Db 354 YCNDLMKNLESSPLSRIIWKALKPLLVG 381 

Qy 452 HLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVWLNISAEIRSFLEQGRLQQ 511

Mill : | : :. | : | | : : I : I : I : I : I : 

Db 382 KILYTPDTPATRQVMAEVNKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434

Qy 512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547 

: I I I II I III : I I : I : I 

Db 435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN — 492 

Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607

I : I I : | | I : : : : I : : I : : I : I 

Db 493 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME -- LLDERKFWAG 535

Qy 608 VI FQTRKDGS — LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG GRFYFLYGFVW 662 

: : I II I I I I I I I I : I : I I : I : I I I I I I I I : 

Db 536 IVFTGITPGSIELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPRADPFEDMRYVWGGFAY 595 

Qy 663 IQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAM 722

: I I : : I : I I I I : : I I : I I I I I I III): I I I I : : I : I I I I : 

Db 596 LQDVVEQAIIRVLTGTE-KKTGVYMQQMPYPCYVDDIFLRVMSRSMPLFMTLAWIYSVAV 654

Qy 723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSH 782 

I : I I I I I I I I I I : I I I : I : : I : i I I : : I : I | I I I I : I : I 
Db 655 IIKGIVYEKEARLKETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWILKLGNLLPYSD 714 

Qy 783 WT I W L F LAVYAVAT I M FC FL VS VL Y S KAK LAS AC GGIIYFLS YVP YM YVAI RE E VAH D K 842 

: : : : I I : I : I I I I : I I I : I I : I : I I I : I I I I I I I I I : I I : II 
Db 715 PSWFVFLSVFAVVTILQCFLISTLFSRANLAAACGGIIYFTLYLPYVLC VAWQD 769 

Qy 843 ITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLM 901 

I I I I I : I I I I I : I I I I : I I : I : I I : I I I I I I I I : I : I : : 
Db 770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMML 829 

Qy 902 VDAWYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961

I : I I : : I I I I I I I I I I I : I I I I I I I I I I I I hill: hi 
Db 830 FDTFLYGVMTWYI EAVFPGQYGI PRPWYFPCTKS YWFGE ESDEKSHPGSNQKRMS— 884 

Qy 962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021

: I I I I I II I I I I : I 111 = 1 h h : h I I I I I : 

Db 885 ETC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929 



Qy 1022 SFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLT 1081

I I I I I I I I I I I I I I I I I I I I I I I I I h I I I I I h I I I I : I I h I I I I I I I I I II

Db 930 SFLGHNGAGKTTTMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLGVCPQHNVLFDMLT 989



Qy 1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140

I I I I : I I I : I I I :::::: I I : : I I : I I : I I I I I I I : I I I I I I : I I I I

Db 990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

II: : I I I I I I I I I I I I : I I I I : I : I I I : I I I I : I I I I I I I I I I : I I I I I I I I I I I I I

Db 1050 GSKWI LDEPTAGVDP YSRRGIWELLLKYRQGRT I I LSTHHMDEADVLGDRI AI I SHGKL 1109

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPG --GPQEPGLAS 1238

I I I I I I I I I I I I I I I : I : I I I

Db 1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169

Qy 1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298

I : I Mill: II I .11:1:11 I I I I : I I I II : :

Db 1170 DHESDTLTIDVS--AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227

Qy 1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358

I I : I I : I : : I I I I I : I I I i : I I I I I : II

Db 1228 RLSDLGI SSYGI SETTLEEI FLKVAEE SGVDA-ETSDGTLP 1267

Qy 1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD--NVSLQEV 1415

II : I : I II |: ||: :: :

Db 1268 ARRNRRA-FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303

Qy 1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474

I : II : I : I I : : I I : I I I I I I I I I : | I : I I : I I I I I I : I :

Db 1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363

Qy 1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533

: I I I I I I I I I : I I : : : | | : I : :

Db 1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE DTGTLELLNAL 1408

Qy 1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593

III: : I I :

Db 1409 TKDPGFGTRCM EGNPI 1424

Qy 1594 P S PAP S D S PAS P DEDLQAWNVS L P PT AGP EMWT SAP S L P RLVRE PVR C 1641

II | | | | | I : I I : I : : : : I

Db 1425 PD TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 1463

Qy 1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF 1690

II: II II III: I I I I I : I I I : I : I I : I :

Db 1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523

Qy 1691 RLHRYGAI T FG -- NVLKS I PAS FGT RAP PMVRK 1721

111:11 I: I ::!

Db 1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAIKQMKKHLKLAKDSSADRFLNSLGRFMTG 1583

Qy 1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781

: I : I : : I II I : I : : : : I I : I I I I I I I I I I : I I : I I I I I I I : I I

Db 1584 LDTRNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642

Qy 1782 LS-LDYLLQGTDWIAIFIIVAMSFVPAS FWFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840

II : : I I : : : I : I I I I I I I I I I I I I I : I : : I I I I I I I : I I I : I I I I : I

Db 1643 L S EVALMT T S VDVLVS I CVI FAMS FVPAS FWFL I Q ERVS KAKH LQ F I S GVK P VI YWL S N 1702

Qy 1841 YVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEV 1900

: I I I I I I : I I I I : I I I : I I I I I : I I I I I I I I I I : I I I I I I I : :

Db 1703 FVWDMCNYWPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKI 1762



Qy 1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLME 1960 

I I : I I I I : I I I I I I : I I I I : I : I I I I : I I I I I I I I I : : I I I I : : 

Db 1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 

Qy 1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQR 2020 

I I : : : : I : : : II I I : I I I I I I I II II I I : I : : I I I I I : 
Db 1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880

Qy 2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079

: 1:1111 Nihil I I : : : I : I I I : I : : I I I I I : I : I : 
Db 1881 VNAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK RKPAVDRICVGIP 1937

Qy 2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

I I I I I I I I I I I I I I I : I I I I I I I I I : I I : I I : I : I : I : : I I : : I I I I I I I : 
Db 1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 

Qy 2140 FDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199

: I I I I I : : : III: I : : I : I I : I I I I I : I I I I I I I I I I I I I I I : 
Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAI RKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057 

Qy 2200 ALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIM 2259 

I I I I I : I I I II I I I I I I I I I I I I I I I : : I I I I I I I I I I I I I I I I I I I I I : I I I 
Db 2058 ALIGGPPVVFLDEPTTGMDPKARRFLWNCALSVVKEGRSVVLTSHSMEECEALCTRMAIM 2117

Qy 2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I I I I I I I I : I I I I I I I I I I I I I I I : : I I II I I : : I I I : I : I 
Db 2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177 

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

III I I I I : : I I : I I I I I I 1. 1 I I I I I I I I I I I I I III: 

Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227 



RESULT 15 
AAB38114 

ID AAB38114 standard; protein; 2261 AA. 
XX 

AC AAB38114; 
XX 

DT 29-JAN-2001 (first entry) 
XX 

DE Human ABC1 cholesterol transporter mutant, E1172D. 
XX 

KW Human ABC1 cholesterol transporter; chromosome 9q31; 

KW ATP-binding cassette; HDL deficiency disorder; high density lipoprotein; 

KW Tangier disease; TD; familial HDL deficiency; FHA; polymorphism; 

KW cardiovascular disease; coronary artery disease; coronary restenosis; 

KW cerebrovascular disease; peripheral vascular disease; 

KW Alzheimer's disease; Niemann-Pick disease; Huntington's disease; 

KW X-linked adrenoleukodystrophy ; cancer; gene therapy; genetic diagnosis; 

KW prognosis; prophylaxis; drug screening; transgenic animal; mutant; 

KW mutein. 

XX 

OS Homo sapiens. 



XX 

PN WO200055318-A2. 
XX 

PD 21-SEP-2000. 
XX 

PF 15-MAR-2000; 2000WO-IB000532 . 
XX 

PR 15-MAR-1999; 99US-0124702P . 

PR 08-JUN-1999; 99US-0138048P .

PR 17-JUN-1999; 99US-0139600P . 

PR 01-SEP-1999; 99US-0151977P . 
XX 

PA (UYBR-) UNIV BRITISH COLUMBIA. 

PA (XENO-) XENON BIORESEARCH INC. 
XX 

PI Hayden MR, Wilson AR, Pimstone SN; 
XX 

DR WPI; 2000-587528/55. 
XX 

PT New ABC1 polypeptide is useful for treating diseases associated with ABC1

PT biological activity, e.g. Alzheimer's disease, Huntington's disease and 

PT cancer. 
XX 

PS Example; Page; 229pp; English. 
XX 

CC The invention relates to the human ABC1 cholesterol transporter protein

CC (B38082) and to nucleic acid sequences (C69120) which encode it. ABC1 is

CC a member of the ATP-binding cassette (ABC transporter) superfamily of 

CC proteins, and plays a crucial role in cholesterol transport, particularly 

CC intracellular cholesterol trafficking in monocytes and fibroblasts, being 

CC involved in cholesterol efflux from the cell. The gene encoding ABC1 is

CC located on chromosome 9q31, and mutations in this gene are associated 

CC with two genetic HDL (high density lipoprotein) deficiency disorders, 

CC Tangier disease (TD) and familial HDL deficiency (FHA). These diseases

CC are distinguishable in that TD is an autosomal recessive disorder, while 

CC FHA is inherited as an autosomal dominant trait. Low levels of HDL ("good 

CC cholesterol") in the blood correlate with a high risk of cardiovascular 

CC disease, particularly coronary artery disease, but also cerebrovascular 

CC disease, coronary restenosis, and peripheral vascular disease. 

CC Conversely, a high level of HDL has protective effects against 

CC cardiovascular disease. The invention provides genetic constructs and 

CC transgenic cells and non-human animals comprising human ABC1 nucleic

CC acids, and methods of gene therapy for the treatment or prevention of 

CC cardiovascular disease comprising the administration of an expression 

CC vector encoding ABC1 or an active fragment thereof. The invention also

CC encompasses compounds which mimic ABC1 activity, compounds which

CC stimulate ABC1 expression and methods of screening for such compounds. It

CC further relates to methods for determining whether a patient has an 

CC increased risk for cardiovascular disease due to polymorphisms in the 

CC ABC1 gene. Human ABC1 proteins and nucleotides can be used to treat or

CC prevent cardiovascular disease, especially coronary artery disease, 

CC cerebrovascular disease, coronary restenosis or peripheral vascular 

CC disease. They may also be used in the treatment of diseases associated 

CC with ABC1 biological activity, such as Alzheimer's disease, Niemann-Pick

CC disease, Huntington's disease, X-linked adrenoleukodystrophy and cancer.

CC The invention specifically excludes proteins with the exact amino acid 

CC sequences of GenBank Accession No: CAA10005.1 and X75926, and the nucleic 



CC acid with the exact sequence as GenBank Accession No: AJ012376.1. The 

CC present sequence represents a mutant human ABC1 cholesterol transporter 

CC associated with an altered cholesterol level and therefore an altered

CC risk of cardiovascular disease. Note: The present sequence is not shown 

CC in the specification, but is derived from the native human ABC1 shown on

CC pages 152-157 
XX 

SQ Sequence 2261 AA; 

Query Match 33.5%; Score 4240.5; DB 3; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 2.5e-307; 

Matches 1000; Conservative 345; Mismatches 730; Indels 435; Gaps 61; 

Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65 

I I : I I I I I I : I : I I I I : I I : I I I : : I I I I I : I I 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I I I : I : : I : : I I : I I : : I : 
Db 65 GTLPWVQGIICNANNPCFRYPTPGEAPGWGNFNKSIVARLFSDARRLL LYSQKDT 120 

Qy 116 SLGSELEALR — QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173 

I : : I I | : : | : : I I I I I I I I I I 

Db 121 SMKDMRKVLRTLQQIKKSSSNLKLQDFLVDNETFSG FLYHNLSLP 165 

Qy . 174 NSTAQALI^ARVT)PPEVTHLLFGPSSALDSQSGLHKGQEPW 233 

II : I I I .: I : I I III : I I : : 
Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204 

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II | | : : : | : | : | | : : | : 

Db 205 QL GDQEVSELCGLPREKLAAAE RVLRSNMDI 235 

Qy 294 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD ■ AQKVLQDVDVLS 341 

I : : I : I I : I : I III I : I : : I I 

Db 236 LKPILRTLNSTSPFPSKELAEA — TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293 

Qy 342 ALALLLPQGACTGRTPGPPASGAGGAAN GTGAGAVMGPNATAEEGAP SAAALATP 396 

: : I : III : I : I I I I : II 

Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353 

Qy 397 DTLQGQCSAFVQ — LWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLV 451 

I :: I : : : I I : I : I I 
Db 354 YCNDLMKNLESSPLSRI IWKALKPLLVG 381 

Qy 452 HLMT SN P K I L YAP AGS E VDRVI LKANET FAFVGNVTH YAQVWLN I S AE I RS FLEQGRLQQ 511 

I I II I : | : : | : | | : : | : | : | : | : | : 

Db 382 KILYTPDTPATRQVMMWKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434 

Qy 512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547 

: I I I I I I I I I : I I : I : I 

Db 435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN-- 492 

Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607 

1:11 :ll I::: : | ::| :: I :| 

Db 493 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME -- LLDERKFWAG 535 



Qy 608 VI FQTRKDGS — LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG GRFYFLYGFVW 662 

::| II II II MM : |:||:|: M Ml I II : 

Db 536 IVFTGITPGSIELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPRADPFEDMRYVWGGFAY 595 

Qy 663 IQDMMERAI I DTFVGHDWEPGS YVQMFP YPCYTRDDFLFVI EHMMPLCMVI SWVYSVAM 722 

: M : : I : M I I : : I I : I Mill I M I : M I I : : I : M I I : 
Db 596 LQDWEQAIIRVLTGTE-KKTGVYMQQMPYPCWDDIFLRVMSRSMPLFMTLAWIYSVAV 654 

Qy 723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWAWFITGFVQLSISVTALTAILKYGQVLMHSH 782 

I ' : M M I M M I : II I : I : : I : M I : : I : I I I I I I : I : I 
Db 655 IIKGIVYEKEARLKETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWILKLGNLLPYSD 714 

Qy 783 WI I W L FLAVYAVAT I MFC FLVS VL YS KAKLAS ACGG 1 1 YFL S YVP YMYVAI RE EVAH D K 842 

:: : : I I : I : I I I I : I I I : I I : I : I I I : I I I I I I I I I : I I : II 
Db 715 PSWBVFLSVFAVVTILQCFLISTLFSRANLAAACGGIIYFTLYLPYVLC VAWQD 769 

Qy 843 ITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLM 901 

I I I I I : I I II I : I I I I : I I : I : I I : I I I I I I I I : I : I :: 
Db 770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMML 829 

Qy 902 VDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961 

I : I I :: II II I II II II : I I I II II Mill Mill: Ml 
Db 830 FDTFLYGVMTWYI EAVFPGQYGI PRPWYFPCTKS YWFGE ESDEKSHPGSNQKRIS — 884 

Qy 962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021 

: I I II I I II I I I : I II I : I I : I : : I : I I I I I : 

Db 885 EIC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929 

Qy 1022 SFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLT 1081 

I M I M I M II I II I I II I I II II II : I I I I I I: I I I I: I I M I I I II II II I I 
Db 930 S FLGHNGAGKTTTMS I LTGLFPPTSGTAYI LGKDI RSEMSTI RQNLGVCPQHNVLFDMLT 989 

Qy 1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140 

I I I I - I I M I II : : : : : : I I : : I I : I Ml I M I I I : I I I II I : I I I I 

Db 990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049 

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200 

IM : I I II I I I I I I II : II I I : I : I I I : I II M I I I II I I I I M I I I I I II M I I I I 
Db 1050 GSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKL 1109 

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPG GPQEPGLAS 1238 

I I I II I I I I I I I II I : I : I I I 

Db 1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

Qy 1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298 

I : I II II M III I I : M I I I II I : I I I II : : 
Db 1170 DHDSDTLTIDVS— AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227 

Qy 1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358 

I I :IMM MIMMMIMM I I M II 
Db 1228 RLSDLGI S SYGI SETTLEEI FLKVAEE SGVDA-ETSDGTLP 1267 

Qy 1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD — NVSLQEV 1415 

I I : I : I II MM:::: 

Db 1268 :ARRNRRA-FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303 

Qy 1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474 



I : I I : I : I I :: I. I : I I I I I I | | | : | I : I I : I I I | I | : j : 
Db 1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363 

Qy 1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533 

I II I I I I I I: II : : : I I : I : : 
Db 1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE DTGTLELLNAL 1408 

Qy 1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

Ml: :| |s 
Db 1409 TKDPGFGTRCM EGNPI 1424 

Qy 1594 PSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVRC 1641 

II I I I I I I : I I : I : : : : | 
Db 1425 PD TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 1463 

Qy 1642 TCSAQGTGFS CPSSVGG-HPPQMRVVTGDILTDITGHNVSEYLLFTSDRF 1690 

II: I I II III: I I I I I : I I I : I : I I : I : 
Db 1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523 

Qy 1691 RLHRYGAITFG--NVLKSIPASFGTRAPPMVRK 1721 

111:11 I : I : : I 
Db 1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAIKQMKKHLKLAKDSSADRFLNSLGRFMTG 1583 

Qy 1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781 

: I : I : : I I I I : I : : : : I I : I I II | | | | | | : | | : | M I I I I : I I 
Db 1584 LDTRNNVKWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642 

Qy 1782 LS-LDYLLQGTDVVIAIFIIVAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840 

N: : I I ::: I : I I I I I I I I I I I I I I : I :: I I I I I II : I' I I : I I I I : I 
Db 1643 LSEVALMTTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKAKHLQFISGVKPVIYWLSN 1702 

Qy 1841 YVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEV 1900 

: I I I I I I : I I II : I I I : I I I I I : I I I I I I I I I I : I I I I I | | : : 
Db 1703 FVWDMCNYVVPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKI 1762 

Qy 1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLME 1960 

11:111 I : I I J f I I :IMI:|:|| I I :| I I I I I I I I :: I I I I : : 
Db 1763 PSTAYVVLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 

Qy 1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGVVGFLLTIMCQYNFLRRPQR 2020 

I I : : : : I : : : II I I : I I I I I I I I I I I I I : I :: I I I I I : 
Db 1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGVVFFLITVLIQYRFFIRPRP 1880 

Qy 2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079 

I : I I M I I I I : I I I | : : : | : | | I : I : : I I I I I : I : I : 
Db 1881 VNAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK RKPAVDRICVGIP 1937 

Qy 2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

I I I I I I I I I I I I I I I : I M I I I I I I : | I : I I : I : I : I : :| |::||||| ||: 
Db 1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 

Qy 2140 FDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199 

: M IN:: : III: |: : I : I I : I I I I I : I I I I I I I I I I I I I I I : 
Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057 

Qy 2200 ALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIM 2259 

Mill :MI III II Ml III III II I : : I I I I I I I I I I I I I I I I I I I I I : I I I 



Db 2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSWKEGRSWLTSHSMEECEALCTRMAIM 2117 

Qy 2260 WGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I I I I I I I I : I I I I I I I I I I I I I! I: :| I II II ::|||:| :| 
Db 2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177 

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

III I I I I : : I I : I I I I I I I I I I I I I I I I I I I II III: 

Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227 



Search completed: September 1, 2004, 10:52:37 
Job time : 223 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on:          September 1, 2004, 10:49:12 ; Search time 45 Seconds 
                 (without alignments) 
                 2794.686 Million cell updates/sec 

Title:           US-10-088-467-2 
Perfect score:   12668 
Sequence:        1 MGFLHQLQLLLWKNVTLKRR ... GLISFEEERAQLSFNTDTLC 2436 

Scoring table:   BLOSUM62 
                 Gapop 10.0 , Gapext 0.5 

Searched:        389414 seqs, 51625971 residues 

Total number of hits satisfying chosen parameters: 389414 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
                 Maximum Match 100% 
                 Listing first 45 summaries 

Database:        Issued_Patents_AA:* 
                 /cgn2_6/ptodata/2/iaa/5A_COMB.pep:* 
                 /cgn2_6/ptodata/2/iaa/5B_COMB.pep:* 
                 /cgn2_6/ptodata/2/iaa/6A_COMB.pep:* 
                 /cgn2_6/ptodata/2/iaa/6B_COMB.pep:* 
                 /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep: 
                 /cgn2_6/ptodata/2/iaa/backfiles1.pep: 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
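
The figures in the search header can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "cell updates" means query length times database residues, and that the Query Match column in the summaries is the hit score as a percentage of the perfect score. Both assumptions reproduce the values printed in this report.

```python
# Values copied from the search header of this report.
QUERY_LENGTH = 2436          # residues in the query (US-10-088-467-2)
DB_RESIDUES = 51_625_971     # "Searched: 389414 seqs, 51625971 residues"
SEARCH_SECONDS = 45          # "Search time 45 Seconds"
PERFECT_SCORE = 12668        # "Perfect score: 12668"


def cell_updates_per_sec(query_len, db_residues, seconds):
    """Dynamic-programming cells per second (assumed: query x database / time)."""
    return query_len * db_residues / seconds


def query_match_percent(score, perfect_score):
    """Hit score as a percentage of the perfect score (assumed definition)."""
    return 100.0 * score / perfect_score


rate = cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
print(f"{rate / 1e6:.3f} Million cell updates/sec")            # 2794.686
print(f"{query_match_percent(7046, PERFECT_SCORE):.1f}")       # 55.6
print(f"{query_match_percent(4240.5, PERFECT_SCORE):.1f}")     # 33.5
```
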



SUMMARIES 

                       % 
Result               Query 
   No.     Score     Match   Length  DB            ID                       Description 
     1      7046      55.6     1457   3            US-08-665-259-27         Sequence 27, Appl 
     2      7046      55.6     1457   3            US-08-762-500-27         Sequence 27, Appl 
     3      4240.5    33.5     2261   4            US-09-526-193A-1         Sequence 1, Appli 
     4      3173.5    25.1     1375   3            US-08-665-259-26         Sequence 26, Appl 
     5      3173.5    25.1     1375   3            US-08-762-500-26         Sequence 26, Appl 
     6      2622      20.7     1684   3            US-08-665-259-25         Sequence 25, Appl 
     7      2622      20.7     1684   3            US-08-762-500-25         Sequence 25, Appl 
     8      2622      20.7     1704   3            US-08-762-500-75         Sequence 75, Appl 
     9      363.5      2.9      589   4            US-09-328-352-7592       Sequence 7592, Ap 
    10      360        2.8      317   4            US-09-489-039A-10626     Sequence 10626, A 
    11      359.5      2.8      345   4            US-09-252-991A-31957     Sequence 31957, A 
    12      343.5      2.7      532   4            US-09-543-681A-4646      Sequence 4646, Ap 
    13      341.5      2.7      788   4            US-09-252-[illegible]    [illegible] 
    14      339.5      2.7      315   4            US-09-134-000C-6449      Sequence 6449, Ap 
    15      338.5      2.7      607   4            US-09-252-[illegible]    Sequence 18351, A 
    16      337.5      2.7      315   4            US-09-328-[illegible]    [illegible] 
    17      336        2.7      309   4            US-09-[illegible]        [illegible] 
    18      334        2.6      335   4            US-09-252-[illegible]    [illegible] 
    19      333        2.6      588   4            US-09-489-[illegible]    [illegible] 
    20      330.5      2.6      922   4            US-09-[illegible]        [illegible] 
    21      330        2.6      594   4            US-09-[illegible]        [illegible] 
    22      328        2.6      332   4            US-09-107-[illegible]    [illegible] 
    23      327.5      2.6      304   4            US-09-107-[illegible]    [illegible] 
    24      327        2.6      291   4            US-09-107-532A-4205      Sequence 4205, Ap 
    25      327        2.6      929   4            US-09-252-[illegible]    [illegible] 
    26      326.5      2.6      323   4            US-09-489-[illegible]    [illegible] 
    27      322.5      2.5      322   4            US-09-107-[illegible]    [illegible] 
    28      316.5      2.5      316   4            US-09-543-[illegible]    [illegible] 
    29      315.5      2.5     1280   2            US-08-583-276-19         Sequence 19, Appl 
    30      315        2.5     1279   4            US-09-672-810-6          Sequence 6, Appli 
    31      314.5      2.5     1280   4            US-09-767-[illegible]    [illegible] 
    32      314.5      2.5     1280   4            US-09-672-810-5          Sequence 5, Appli 
    33      314.5      2.5     1280   [illegible]  5206352-4                Patent No. 5206352 
    34      314        2.5      233   4            US-09-627-[illegible]    [illegible] 
    35      312.5      2.5     1280   4            US-09-672-[illegible]    [illegible] 
    36      312.5      2.5     1283   4            US-09-672-810-4          Sequence 4, Appli 
    37      312        2.5     1279   2            US-08-784-649A-2         Sequence 2, Appli 
    38      309.5      2.4      402   4            US-09-107-[illegible]    [illegible] 
    39      308.5      2.4      391   4            US-09-252-[illegible]    [illegible] 
    40      307.5      2.4     1280   2            US-08-752-447-2          Sequence 2, Appli 
    41      307.5      2.4     1280   4            US-09-316-167-2          Sequence 2, Appli 
    42      307.5      2.4     1280   4            US-09-397-233-2          Sequence 2, Appli 
    43      304.5      2.4      231   4            US-09-134-001C-3824      Sequence 3824, Ap 
    44      303.5      2.4      243   4            US-09-543-681A-5911      Sequence 5911, Ap 
    45      301.5      2.4      254   4            US-09-107-532A-4983      Sequence 4983, Ap 



ALIGNMENTS 
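
The header states that the search used the "sw" (Smith-Waterman) model with Gapop 10.0 and Gapext 0.5. As a rough illustration of what such a local-alignment score involves, the sketch below implements the affine-gap (Gotoh) recurrences. It is not GenCore's implementation: `toy_score` is a hypothetical stand-in for the BLOSUM62 table, and the convention that a gap of length k costs `gap_open + k * gap_extend` is an assumption.

```python
def toy_score(x, y):
    """Hypothetical stand-in for BLOSUM62: +5 for identity, -4 otherwise."""
    return 5.0 if x == y else -4.0


def smith_waterman_score(a, b, score=toy_score, gap_open=10.0, gap_extend=0.5):
    """Best local alignment score of a and b with affine gap costs.

    Assumed convention: a gap of length k costs gap_open + k * gap_extend.
    Only the score is returned; no traceback is kept.
    """
    neg_inf = float("-inf")
    m = len(b)
    h_prev = [0.0] * (m + 1)      # H values of the previous row
    f_prev = [neg_inf] * (m + 1)  # best score ending in a vertical gap, previous row
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf               # best score ending in a horizontal gap, current row
        for j in range(1, m + 1):
            # Open a new gap from H, or extend the gap already running.
            e = max(h_cur[j - 1] - gap_open - gap_extend, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open - gap_extend,
                           f_prev[j] - gap_extend)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0.0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


# Twelve identities bridged by one single-residue gap: 12 * 5 - (10 + 0.5)
print(smith_waterman_score("ACDEFGHIKLMN", "ACDEFGWHIKLMN"))  # 49.5
```

With a real substitution matrix, `score` would look up the residue pair in the BLOSUM62 table instead of testing for identity; the recurrences stay the same.
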



RESULT 1 

US-08-665-259-27 

Sequence 27, Application US/08665259 
Patent No. 6028173 
GENERAL INFORMATION: 

APPLICANT: Landes, Gregory M. 
APPLICANT: Burn, Timothy C. 
APPLICANT: Connors, Timothy D. 
APPLICANT: Dackowski, William R. 
APPLICANT: Van Raay, Terence J. 
APPLICANT: Klinger, Katherine W. 

TITLE OF INVENTION: NOVEL HUMAN CHROMOSOME 16 GENES, 

TITLE OF INVENTION: COMPOSITIONS, METHODS OF MAKING AND USING SAME 
NUMBER OF SEQUENCES: 73 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: GENZYME CORPORATION 
STREET: One Mountain Road 
CITY: Framingham 



STATE: Massachusetts 
COUNTRY: United States of America 
ZIP: 01701 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/665,259 
FILING DATE: 17-JUN-1996 
CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 
NAME: Dugan, Deborah A. 
REGISTRATION NUMBER: 37,315 
REFERENCE/DOCKET NUMBER: IG5-9-1 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (508) 872-8400 
TELEFAX: (508) 872-5415 
INFORMATION FOR SEQ ID NO: 27: 
SEQUENCE CHARACTERISTICS : 
LENGTH: 1457 amino acids 
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: unknown 
MOLECULE TYPE: protein 
US-08-665-259-27 

Query Match 55.6%; Score 7046; DB 3; Length 1457; 

Best Local Similarity 94.2%; Pred. No. 0; 

Matches 1374; Conservative 22; Mismatches 59; Indels 4; Gaps 4; 

Qy 980 MEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLGHNGAGKTTTMSILT 1039 

I I I I I I I I I I I I I I I I I I I I I I : I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 1 MEEEPTHLPLVVCVDKLTKVYKNDKKLALNKLSLNLYENQVVSFLGHNGAGKTTTMSILT 60 

Qy 1040 GLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEE 1099 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 GLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEE 120 

Qy 1100 IRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYAR 1159 

I I : I I I I M I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 IRKETDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYAR 180 

Qy 1160 RAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRL 1219 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 RAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGAYXDGYRL 240 

Qy 1220 TLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYIL 1279 

I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 TLVKQPAEPGTSQEPGLASSPSGCPRLSSCSEPQVSQFIRKHVASSLLVSDTSTELSYIL 300 

Qy 1280 PSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKES 1339 

I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 PSEAVKKGAFERLFQQLEHSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKES 360 



Qy 1340 RKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRP 1399 

I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I : I I I I I I I 
Db 361 RKDVLPGAEGLTAVGGQAGNLARCSELAQSQASLQSASSVGSARGEEGTGYSDGYGDYRP 420 

Qy 1400 LFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALF 1459 

I I I I I I I I I I I I I I I I I I :: I I I I I I I I : I I I I : I I I I I I I I I I I I I I I I I I I I I 
Db 421 LFDNLQDPDNVSLQEAEMEALAQVGQGSRKLEGWWLKMRQFHGLLVKRFHCARRNSKALC 480 

Qy 1460 SQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEERREYRLR 1519 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I : I I I I I 
Db 481 SQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEERQEYRLR 540 

Qy 1520 LSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLES 1579 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 LSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPMLNLSSGESRLLAARFFDSMCLES 600 

Qy 1580 FTQGLPLSNFVPPPPSPAPSDSPASPDED-LQAWNVSLPPTAGPEMWTSAPSLPRLVREP 1638 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I II II I M 
Db 601 FTQGLPLSNFVPPPPSPAPSDSPVXPDEDSLQAWNMSLPPTAGPETWTSAPSLPRLVHEP 660 

Qy 1639 VRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVSEYLLFTSDRFRLHRYGAI 1698 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 661 VRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVSEYLLFTSDRFRLHRYGAI 720 

Qy 1699 TFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPK 1758 

I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I II I I I I I II I I I I I I I I M I I I I 
Db 721 TFGNVQKSIPASFGAKVPPMVRKIAVRRVAQVLYNNKGYHSMPTYLNSLNNAILRANLPK 780 

Qy 1759 SKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDVVIAIFIIVAMSFVPASFVVFLVAEK 1818 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 781 SKGNPAAYXITVTNHPMNKTSASLSLDYLLQGTDVVIAIFIIVAMSFVPASFVVFLVAEK 840 

Qy 1819 STKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLS 1878 

I I I II I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 STKAKHLQFVSGCNPVIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLS 900 

Qy 1879 LFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKVV 1938 

I I I I I I I I I I I I I I I I I I I I I I I I Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 901 LFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKVV 960 

Qy 1939 NSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAV 1998 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 961 NSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMTV 1020 

Qy 1999 EGVVGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKIENLTK 2058 

II II I I I I I I I I I I I I : I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1021 EGFVGFFLTIMCQYNFLRQPQRLPVSTKPVEDDVDVASERQRVLRGDADNDMVKIENLTK 1080 

Qy 2059 VYKSRKIGRILAVDRLCLGV-RPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGH 2117 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 1081 VYKSRKIGRILAVDRLCLGVCVPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGH 1140 

Qy 2118 SVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELT 2177 

I I I I : I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I : I I I I I I I I I I I I 
Db 1141 SVLKDLLQVQQSLGYCPQFDVPVDELTAREHLQLYTRLRCIPWKDEAQVVKWALEKLELT 1200 

Qy 2178 KYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGR 2237 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I 
Db 1201 KYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGR 1260 



Qy 2238 SWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKDW 2297 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I 

Db 1261 SWLTSHSMEECE7VLCTRLAIMVNGRLHCLGSIQHLKNRFGDGYMITVRTKSSQNVKDW 1320 

Qy 2298 RFFNRNFPEAMLKERHHTKVQYQLKSEHI SLAQVFSKMEQVSGVLGIEDYSVSQTTLDNV 2357 

I I II I I I I I I : : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1321 RFFNRNFPEAHAQGKTPYKVQYQLKSEHISLAQVFSKMEQWGVLGIEDYSVSQTTLDNV 1380 

Qy 2358 FVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPEDLDTEDEG 2417 

I I I I I I I I I I I : I I I I I I I : I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 

Db 1381 FVNFAKKQSDNVEQQEAE-PSSLPSPLG-LLSLLRPRPAPTELRALVADEPEDLDTEDEG 1438 



Qy 2418 LISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I ! I I I I I 

Db 1439 LISFEEERAQLSFNTDTLC 1457 
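
The "Best Local Similarity" figure reported with each alignment appears to be the number of identical positions divided by the total number of aligned columns. The sketch below checks that reading against two results in this report; the function name is hypothetical and the formula is inferred from the printed numbers rather than taken from GenCore documentation.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent identity over all aligned columns (formula inferred from this report)."""
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


# RESULT 1: Matches 1374; Conservative 22; Mismatches 59; Indels 4
print(f"{best_local_similarity(1374, 22, 59, 4):.1f}%")      # 94.2%

# 2261-residue hit: Matches 1000; Conservative 345; Mismatches 730; Indels 435
print(f"{best_local_similarity(1000, 345, 730, 435):.1f}%")  # 39.8%
```

For RESULT 1 the 1459 aligned columns also agree with the alignment itself: query residues 980 to 2436 span 1457 positions, plus two gap columns in the query.
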



RESULT 2 

US-08-762-500-27 

Sequence 27, Application US/08762500 
Patent No. 6030806 
GENERAL INFORMATION: 

APPLICANT: Landes, Gregory M. 
APPLICANT: Burn, Timothy C. 
APPLICANT: Connors, Timothy D. 
APPLICANT: Dackowski, William R. 
APPLICANT: Van Raay, Terence J. 
APPLICANT: Klinger, Katherine W. 

TITLE OF INVENTION: NOVEL HUMAN CHROMOSOME 16 GENES, 

TITLE OF INVENTION: COMPOSITIONS, METHODS OF MAKING AND USING SAME 
NUMBER OF SEQUENCES: 83 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: GENZYME CORPORATION 
STREET: One Mountain Road 
CITY: Framingham 
STATE: Massachusetts 
COUNTRY: United States of America 
ZIP: 01701 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/762,500 
FILING DATE: 09-DEC-1996 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/665,259 
FILING DATE: 17-JUN-1996 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: PCT/US96/10469 
FILING DATE: 17-JUN-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Dugan, Deborah A. 



REGISTRATION NUMBER: 37,315 
REFERENCE/DOCKET NUMBER: IG5-9.3 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (508) 872-8400 
TELEFAX: (508) 872-5415 
INFORMATION FOR SEQ ID NO: 27: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1457 amino acids 
TYPE: amino acid 
STRANDEDNESS : not relevant 
TOPOLOGY: unknown 
MOLECULE TYPE: protein 
US-08-762-500-27 

Query Match 55.6%; Score 7046; DB 3; Length 1457; 

Best Local Similarity 94.2%; Pred. No. 0; 

Matches 1374; Conservative 22; Mismatches 59; Indels 4; Gaps 4; 

Qy 980 MEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLGHNGAGKTTTMSILT 1039 

I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MEEEPTHLPLVVCVDKLTKVYKNDKKLALNKLSLNLYENQVVSFLGHNGAGKTTTMSILT 60 

Qy 1040 GLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEE 1099 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 GLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEE 120 

Qy 1100 IRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYAR 1159 

I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 IRKETDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYAR 180 

Qy 1160 RAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRL 1219 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 RAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGAYXDGYRL 240 

Qy 1220 TLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYIL 1279 

1111:11111 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 TLVKQPAEPGTSQEPGLASSPSGCPRLSSCSEPQVSQFIRKHVASSLLVSDTSTELSYIL 300 

Qy 1280 PSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKES 1339 

I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 PSEAVKKGAFERLFQQLEHSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKES 360 

Qy 1340 RKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRP 1399 

I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I : I I I I I I I 

Db 361 RKDVLPGAEGLTAVGGQAGNLARCSELAQSQASLQSASSVGSARGEEGTGYSDGYGDYRP 420 

Qy 1400 LFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALF 1459 

I I I I I I I I I I I I I I I I I I :: I I I I I I I I : I I I I : I I I I I I I I I I I I I I I I I I I I I 
Db 421 LFDNLQDPDNVSLQEAEMEALAQVGQGSRKLEGWWLKMRQFHGLLVKRFHCARRNSKALC 480 

Qy 1460 SQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEERREYRLR 1519 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I : I I I I I 
Db 481 SQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEERQEYRLR 540 

Qy 1520 LSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLES 1579 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 LSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPMLNLSSGESRLLAARFFDSMCLES 600 



Qy 1580 FTQGLPLSNFVPPPPSPAPSDSPASPDED-LQAWNVSLPPTAGPEMWTSAPSLPRLVREP 1638 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I II II I M 

Db 601 FTQGLPLSNFVPPPPSPAPSDSPVXPDEDSLQAWNMSLPPTAGPETWTSAPSLPRLVHEP 660 

Qy 1639 VRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVSEYLLFTSDRFRLHRYGAI 1698 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 VRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVSEYLLFTSDRFRLHRYGAI 720 

Qy 1699 TFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPK 1758 
I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I II I I I I I II I I I I I I I I M I I I I 
Db 721 TFGNVQKSIPASFGARVPPMVRKIAVRRVAQVLYNNKGYHSMPTYLNSLNNAILRANLPK 780 

Qy 1759 SKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDVVIAIFIIVAMSFVPASFVVFLVAEK 1818 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 781 SKGNPAAYXITVTNHPMNKTSASLSLDYLLQGTDVVIAIFIIVAMSFVPASFVVFLVAEK 840 

Qy 1819 STKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLS 1878 

I I I II I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 841 STKAKHLQFVSGCNPVIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLS 900 

Qy 1879 LFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKVV 1938 

I I I I I I I I I I I I I I I I I I I I I I I I Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 901 LFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKVV 960 

Qy 1939 NSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAV 1998 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 NSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMTV 1020 

Qy 1999 EGVVGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKIENLTK 2058 

II II I I I I I I I I I I I I : I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1021 EGFVGFFLTIMCQYNFLRQPQRLPVSTKPVEDDVDVASERQRVLRGDADNDMVKIENLTK 1080 

Qy 2059 VYKSRKIGRILAVDRLCLGV-RPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGH 2117 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 1081 VYKSRKIGRILAVDRLCLGVCVPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGH 1140 

Qy 2118 SVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELT 2177 

I I I I : I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I : I I I I I I I I I I I I 
Db 1141 SVLKDLLQVQQSLGYCPQFDVPVDELTAREHLQLYTRLRCIPWKDEAQVVKWALEKLELT 1200 

Qy 2178 KYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGR 2237 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I 
Db 1201 KYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGR 1260 

Qy 2238 SVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKDVV 2297 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : II I I I 

Db 1261 SVVLTSHSMEECEALCTRLAIMVNGRLHCLGSIQHLKNRFGDGYMITVRTKSSQNVKDVV 1320 

Qy 2298 RFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNV 2357 

I I I I I I I I I I : : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1321 RFFNRNFPEAHAQGKTPYKVQYQLKSEHISLAQVFSKMEQVVGVLGIEDYSVSQTTLDNV 1380 

Qy 2358 FVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPEDLDTEDEG 2417 

I I I I I I I I I I I : I I I I I I I : I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 

Db 1381 FVNFAKKQSDNVEQQEAE-PSSLPSPLG-LLSLLRPRPAPTELRALVADEPEDLDTEDEG 1438 



Qy 2418 LISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I M I I I I I 

Db 1439 LISFEEERAQLSFNTDTLC 1457 



RESULT 3 

US-09-526-193A-1 

; Sequence 1, Application US/09526193A 

; Patent No. 6617122 

; GENERAL INFORMATION: 

; APPLICANT: Hayden, Michael R. 

; APPLICANT: Brooks-Wilson, Angela R.
; APPLICANT: Pimstone, Simon N.

; TITLE OF INVENTION: METHODS AND REAGENTS FOR MODULATING 
; TITLE OF INVENTION: CHOLESTEROL LEVELS 
; FILE REFERENCE: 50110/002005 

; CURRENT APPLICATION NUMBER: US/09/526,193A

; CURRENT FILING DATE: 2000-03-15 

; PRIOR APPLICATION NUMBER: 60/124,702 

; PRIOR FILING DATE: 1999-03-15 

; PRIOR APPLICATION NUMBER: 60/138,048 

; PRIOR FILING DATE: 1999-06-08 

; PRIOR APPLICATION NUMBER: 60/139,600 

; PRIOR FILING DATE: 1999-06-17 

; PRIOR APPLICATION NUMBER: 60/151,977 

; PRIOR FILING DATE: 1999-09-01 

; NUMBER OF SEQ ID NOS: 287

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 1 

; LENGTH: 2261
; TYPE: PRT
; ORGANISM: Homo sapiens 
US-09-526-193A-1 

Query Match 33.5%; Score 4240.5; DB 4; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 0; 

Matches 1000; Conservative 345; Mismatches 730; Indels 435; Gaps 61; 
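
The summary figures of each result are internally consistent. In the records listed here, Best Local Similarity appears to equal Matches divided by the sum of Matches, Conservative, Mismatches and Indels (for this result, 1000 / 2510 = 39.8%). The Python sketch below checks that relationship. The formula is inferred from the numbers in this listing, not taken from the search program's documentation, and the function names are illustrative only.

```python
import re


def parse_counts(line):
    """Parse a line such as 'Matches 1000; Conservative 345; ...' into a dict of ints."""
    return {name: int(value) for name, value in re.findall(r"([A-Za-z]+)\s+(\d+);", line)}


def local_similarity(counts):
    """Percent identity over aligned columns (assumed formula, inferred from the listing)."""
    aligned = (counts["Matches"] + counts["Conservative"]
               + counts["Mismatches"] + counts["Indels"])
    return 100.0 * counts["Matches"] / aligned


line = "Matches 1000; Conservative 345; Mismatches 730; Indels 435; Gaps 61;"
print(round(local_similarity(parse_counts(line)), 1))
```

The same calculation reproduces the 46.5% and 34.0% figures reported for the later results, which suggests the Gaps count is not part of the denominator.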

Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65 

I I : I I I I I I : I : I I I I :. I I : I I I : : I I I I I : I I 

Db -6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 



Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I I I : I : : I : : I I : I I : : I : 

Db 65 GT L P WVQG 1 1 CN ANN P C FRY PT P GEAP GWGN FN K S I VARL F S D ARRL L LYSQKDT 120 

Qy 116 SLGSELEALR — QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173 

I : . : I I I : : I : Mil I I I I I I I 

Db 121 SMKDMRKVLRTLQQIKKSSSNLKLQDFLVDNETFSG FLYHNLSLP 165 

Qy 174 NSTAQALLAARVDPPEWHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233

II : I I I : I : I I III : I I : : 
Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204 

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II | | : : : | : | : I I : : I : 

Db 205 QL GDQEVSELCGLPREKLAAAE RVLRSNMDI 235 



Qy 294 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD AQKVLQDVDVLS 341 

I : : I : I I : I : I III I : I : : I I 

Db 236 LKPILRTLNSTSPFPSKELAEA — TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293 

Qy 342 ALALL L PQ GACT GRT P G P PAS GAGGAAN GTGAGAVMGPNATAEEGAP SAAALAT P 396 

' : I : III : I : I I I I : II 

Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353

Qy 397 DTLQGQCSAFVQ— LWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLV 451 

I :: . I : : : I I : I : I I 
Db 354 YCNDLMKNLESSPLSRIIWKALKPLLVG 381 

Qy 452 HLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVWLNISAEIRSFLEQGRLQQ 511 

I I I I I : | : : | : | | : : | : | : | : | : | : 

Db 382 — KILYTPDTPATRQVMAEVNKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434 

Qy 512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547 

: I I I I I I I I I : I i : I : I 

Db 435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN — 492 

Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607 

I : II :|| |::: : I ::| :: I :| 

Db 493 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME--LLDERKFWAG 535 

Qy 608 VI FQTRKDGS — LPPHVHYKI RQNS S FTEKTNEI RRAYWRPGPNTG GRFYFLYGFVW 662 

::| II II II MM : |:||:|: Mill I II : 

Db 536 I VFTGITPGS I ELPHHVKYKI RMDI DNVERTNKI KDGYWDPGPRADP FEDMRYVWGGFAY 595 

Qy 663 IQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVI EHMMPLCMVISWVYSVAM 722 

: M : : I : II I I : : I I : I Mill Mil: I I I I : : I : I I I I : 

Db 596 LQDWEQAIIRVLTGTE-KKTGVYMQQMPYPCYVDDIFLRVMSRSMPLFMTLAWIYSVAV 654

Qy 723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWAWFITGFVQLSISVTALTAILKYGQVLMHSH 782 

I : I I I I I I I I I I : I I I : I :: I : I I I : : I : I I III I M : I 
Db 655 1 1 KGIVYEKEARLKETMRIMGLDNSILWFSWFI S SLI PLLVSAGLLWILKLGNLLP YSD 714 

Qy 783 WI IWLFLAWAVATIMFCFLVSVLYSKAKLASACGGI I YFLS WPYMYVAI REEVAHDK 842 

: : : : I I : I : I I I I : I I I : I I : I : I I I : I I I I I I I I MM: II 
Db 715 PSWFVFLSVFAWTI LQCFLI STLFSRANLAAACGGI I YFTLYLPYVLC VAWQD 769 

Qy 843 I TAFE- KC IAS LMSTTAFGLGS KYFAL YEVAGVGI QWHT FSQS PVEGDDFNLLLAVTMLM 901 

I I II I : I I I I I : II I I : I I : I : I I Mill I III MM:: 

Db 770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMML 829 

Qy 902 VDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961 

I M I : : I I I I I I I I I I I :| I I I I I I Mill |:|||: I : I 

Db 830 FDTFLYGVMTWYIEAVFPGQYGIPRPWYFPCTKSYWFGE ESDEKSHPGSNQKRIS — 884 

Qy 962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021 

: I I I I I I I I I I I : I I I I : I I : I : : I : I I I I I : 

Db 885 EIC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929 

Qy 1022 SFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLT 1081 

M I II II I II I I I I I I II I I M I I I I M I I I I I : I I II M I I : I I I I I I I I I II 
Db 930 S FLGHNGAGKTTTMS I LTGLFPPTSGTAYI LGKDI RS EMSTI RQNLGVCPQHNVLFDMLT 989 



Qy 1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140 

Ml |:| Ihlll :::::: ||::| h i hi I I I I I I : I I I I I I : I I I I 

Db 990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049 

Qy 1141 GSRAI I LDEPTAGVDPYARRAIWDLI LKYKPGRTI LLSTHHMDEADLLGDRIAI I SHGKL 1200 

||: : | | | M I I I I I I I : I I I I : I : I I I : I I I I : I I I I I I I I I I : I I I I I I I I I I I I I 
Db 1050 GSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKL 1109

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPG GPQEPGLAS 1238

I I I I I I I I I I I I I I I : I : III 

Db 1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

Qy 1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298 

I : | Mill: III I I : I : I I I I I I : I I I II : : 

Db 1170 DHESDTLTIDVS — AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227 

Qy 1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358 

I I : I I : I : : I I I I I : I I I I : I I I I I : I I 
Db 1228 RLSDLGI SS YGI SETTLEEI FLKVAEE SGVDA-ETSDGTLP 1267 

Qy 1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD— NVSLQEV 1415 

I I : I : I II | : | | : :: : 

Db 1268 ARRNRRA-FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303

Qy 1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474 

| : | I : I : I I : : I I : I I I I I I 111=1 I : I I : I I I I I I : I : 

Db 1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363 

Qy 1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533 

: I I I I I I I I I : I I : : : I I : I : : 

Db 1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE DTGTLELLNAL 1408 

Qy 1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

III: :l I: 
Db 1409 TKDPGFGTRCM EGNPI 1424

Qy 1594 P S PAP S D S PAS P DE DLQAWNVS L P P TAG P EMWT SAP S L P RL VRE P VR C 1641 

II | | | | | I : I I : I : : : : I 

Db 1425 PD TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 1463

Qy 1642 TCSAQGTGFS CPSSVGG-HPPQMRVVTGDILTDITGHNVSEYLLFTSDRF 1690 

||: || II III: I II I I : I I I : I : II : I : 

Db 1464 QC S S DKI KKMLPVC PP GAGGLP P PQRKQNTADI LQDLTGRN I S D YLVKT YVQI I AKS LKN 1523 

Qy 1691 RLHRYGAI T FG — NVLKS I PAS FGTRAP PMVRK : 1721 

I I I : I I I : I : : I 

Db 1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAI KQMKKHLKLAKDSSADRFLNSLGRFMTG 1583 

Qy 1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781

: | : | : : | | | | : | : : : : I I : I I I I I I I I I I : I I : I I I I I I I : I I 
Db 1584 LDTRNWKWFNNKGWHAISSFLWINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642 

Qy 1782 LS - LD YLLQGTDWI AI FI I VAMS FVPAS FWFLVAEKST KAKHLQ FVS GCN P 1 1 YWLAN 1840 

|| : : | | : : : | : | | | | I I I I I I I I I I : I : : I I M I I I : I I hlllhl 

Db 1643 L S EVALMTT S VDVLVS I C VI FAMS FVPAS FWF L I QE RVS KAKH LQ FI S GVK PVI YW L SN 1702 



Qy 1841 YVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEV 1900



: I I I I II : I I I I : I I I : I I I I I : I I I I I I I I I I : I I I I I I I : : 

Db 1703 FVWDMCNYWPATLVI 1 1 FI C FQQKS YVS STNLPVLALLLLLYGWS IT PLMYP AS FVFKI 1762 

Qy 1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLME 1960 

||:||| : I I I I I I : I I I I : I : I I I I : I I I I I I I I I :: I I I I : : 

Db 1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 

Qy 1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQR 2020 

| | : : : : | : . : : II I I : I I I I I I I I I I I I I : I I I I II*. 
Db 1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880 

Qy 2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079 

: I : I I I I I I I I : I I I I : : : I : I I I : I : : I I I I I : I : I * 
Db 1881 VNAKLS PLNDEDEDVRRERQRI LDGGGQNDI LEI KELTKI YRRK RKPAVDRI CVGI P 1937 

Qy 2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

- | | | | | | | I I I I I I I I : I I I I I I I I I : I I : I I : I : I : I : : I I :: I I I I I Ih 
Db 1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 

Qy 2140 FDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199 

: | | | | | : : : I I I : I : : I : I I : I I I I I : I- I I I I I I I I I I I I I I : 
Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057 

Qy 2200 ALI GYPAFI FLDEPTTGMDPKARRFLWNLI LDLI KTGRSVVLTSHSMEECEALCTRLAIM 2259 

Mill :MI Ml I I II ::| Mill II IMM Ml II IMMII 

Db 2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSWKEGRSWLTSHSMEECEALCTRMAIM 2117 

Qy 2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I II I I I I I : I I I I I I I I I I I I I I I : =11 II II : : II I : I : I 

Db 2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177 

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

II I | I M I I : I I M II II II I I I I II I I II I Ml: 

Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227 

RESULT 4 

US-08-665-259-26 

Sequence 26, Application US/08665259 
Patent No. 6028173 
GENERAL INFORMATION: 

APPLICANT: Landes, Gregory M. 
APPLICANT: Burn, Timothy C. 
APPLICANT: Connors, Timothy D. 
APPLICANT: Dackowski, William R. 
APPLICANT: Van Raay, Terence J. 
APPLICANT: Klinger, Katherine W. 

TITLE OF INVENTION: NOVEL HUMAN CHROMOSOME 16 GENES, 

TITLE OF INVENTION: COMPOSITIONS, METHODS OF MAKING AND USING SAME 
NUMBER OF SEQUENCES: 73 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: GENZYME CORPORATION 
STREET: One Mountain Road 
CITY: Framingham 
STATE: Massachusetts 
COUNTRY: United States of America 
ZIP: 01701 



COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/665,259
FILING DATE: 17-JUN-1996
CLASSIFICATION: 435
ATTORNEY/AGENT INFORMATION:
NAME: Dugan, Deborah A.
REGISTRATION NUMBER: 37,315
REFERENCE/DOCKET NUMBER: IG5-9.1
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (508) 872-8400 
TELEFAX: (508) 872-5415 
INFORMATION FOR SEQ ID NO: 26: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1375 amino acids 
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: unknown 
MOLECULE TYPE: protein 
US-08-665-259-26 

Query Match 25.1%; Score 3173.5; DB 3; Length 1375; 

Best Local Similarity 46.5%; Pred. No. 5.4e-274; 

Matches 683; Conservative 205; Mismatches 372; Indels 209; Gaps 30; 

Qy 980 MEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILT 1039 

I I I I I I I I I I : I I I I : I I : I :: I : I I I I I : I I I I I I I I I I I I I I I I I I 
Db 2 MEEEPTHLRLGVSIQNLA^WRDGMKVAVDGLALNFYEGQITSFLGHNGAGKTTTMSILT 61 

Qy 1040 GLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEE 1099

I I I I I I I I : I I I I I I : I I I I : I I I : I I I I I I I I I I I I I I I : I I I : I I I - - 

Db 62 GLFPPTSGTAYI LGKDI RSEMS S I RQNLGVCPQHNVLFDMLTVEEHIWFYARLKGLSEKH 121 

Qy 1100 IRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYA 1158 

: : I I : : I I : I : I I I I I I I : I I I I I I : I I I I I I : : I I I I I I I I I I I I : 
Db 122 VKAEMEQMALDVGLPPSKLKSKTSQLSGGMQRKLSVALAFVGGSKWILDEPTAGVDPYS 181 

Qy 1159 RRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYR 1218 

II I I : I : I I I : I I I I : I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I II 

Db 182 RRGIWELLLKYRQGRTIILSTHHMDEADILGDRIAIISHGKLCCVGSSLFLKNQLGTGYY 241 

Qy 1219 LTLVKRPAEPG GPQEPGLASSPPGRAPLSSCSELQVSQ 1256 

I I I I I : I : I I I I : I 

Db 242 LTLVKKDVESSLSSCRNSSSTVSCLKKEDSVSQSSSDAGLGSDHESDTLTIDVS — AISN 299 

Qy 1257 FIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLE 1316 

Mill: II I I I : I : I I I I I I': I I I II : : I I : I I : I : : I I I I 
Db 300 LIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDDRLSDLGISSYGISETTLE 359 

Qy 1317 EVFLKVS EEDQ S LEN S EADVKE S RKDVL PGAEG P AS GEGHAGNLARC S ELTQ SQAS LQ S A 1376 

I : I I I I : I I III: II 
Db 360 EIFLKVAEE SGVDA-ETSDGTLP — 381 



1377 SSVGSARGDEGAGYTDVYGDYRPLF DNPQDPD--NVSLQEVEAEALSRV-GQGSR 1428 

||: I : I I : I : I I : : : : "I : I I : I : I I 

382 ARRNRRA FGDKQSCLHPFTEDDAVDPNDSDIDPESRETDLLSGMDGKGSY 431 

1429 KLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPL 1488 

: I I I : I I I I I I III: I I : I I : I I I I I I : I : MM I ■ I I 
432 QLKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALVFSLIVPPFGKYPSL 491 

1489 VLS PSQYH-NYTQPRGNFI PYANEERREYRLRLS PDAS P QQ LVS T FRL P S GVGAT CVLKS 1547 
I I |: II : : : | I |:|:: III:: 

492 ELQPWMYNEQYT FVSNDAPE 1 DMGTQELLNALTKDPGFGTRCMEGN 536 

1548 PANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDE 1607

| 1:11 s = I 

537 PIP DTPCL AGEE 548 

1608 D LQAWNVS L P P TAG P EM WTSAPSLPRLVREPVRCTCSAQGTGFS CPSSVGG- 1658 

I I : I | : : : II I III: M II 

549 D WTISPVPQSIVDLFQNGNWTMKNPSP ACQCSSDKIKKMLPVCPPGAGGL 598 

1659 HPPQMRWTGDILTDITGHNVSEYLLFTSDRF RLHRYGAITFG 1701 

III: I I I I : : I I I : I : I I : I : Ml : I 

599 PPPQRKQKTADILQNLTGRNISDYLVKTYVQIIAKSLKNKIWVNEFRYGGFSLGVSNSQA 658 

1702 N VL K S I PAS FGT RAP PMVRK I A VRRAAQVFYNNKGYHSMPT 1742

: | : : I I : : : :|::||||:|:: : 

659 LPPSHEVMDAI KQMKKLLKLTKDTSADRFLSSLGRFMAGLDTKNNVKVWFNNKGWHAISS 718 

1743 YLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLS-LDYLLQGTDWIAIFIIV 1801

: I I : I I I I I I I I I I : I I : I I I I I I I : I I I I : - I I : : : I : I 

719 FLNVINNAILRANLQKGE-NPSQYGITAFNHPLNLTKQQLSEVALMTTSVDVLVSICVIF 777

1802 AMS FVPAS FWFLVAEKSTKAKHLQFVS GCN P 1 1 YWLAN YVWDMLN YLVPATCCVI I LFV 1861 
MM I II llllll: I: :| 1.11111:1 I I : I II I : I : I I I I I I : I I II Ml 
778 AMS FVPAS FWFLIQERVSKAKHLQFI SGVKPVI YWLSNFVWDMCN YWPATLVI 1 1 FI C 837 

1862 FDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATV 1921 

I MIMI: I II I II I II I : I I I I I I I :M I : I I I I M I II II M 
838 FQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKIPSTAYVVLTSVNLFIGINGSV 897 

1922 ATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMK 1981 

I I I : I : I I ::| I M III Mill:: || ||::| |: : : : |: :: 
898 ATFVLELFTNNK-LNDINDILKSVFLIFPHFCLGRGLIDMVKNQAMADALERFGE-NRFV 955 

1982 SPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVED-DVDVASERQR 2040 

I I I I : I I I II I II II I I I : I : : I I I II:: MIMI MM 

956 SPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRPVKAKLPPLNDEDEDVRRERQR 1015 

2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 

:| | ||:::|: |||:|: : I II I I : I : I : I I II II I II I I I I II : M I I I 
1016 ILDGGGQNDI LEI KELTKI YRRK RKPAVDRICIGIPPGECFGLLGVNGAGKSTTFKM 1072 

2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

MM | | : || : | : I : I : : I I : : I I I I I I I : : I I I I I : : : MM 
1073 LTGDTPVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAITELLTGREHVEFFALLRGVPE 1132 

2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 



I : : : I I : I I I I I : I I I I I I I I I I I I I I : I I I I I : I I I I I I I I I I I I 

1133 KEVGKFGEWAI RKLGLVKYGEKYASN YSGGNKRKLSTAMALI GGPPWFLDEPTTGMDPK 1192 



Qy 2221 ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280

I I I I I I I | :: | I I I I I I I I I I I I I I I I I I I I : II I I I I I I I I I I : I I I I I I I I I I 

Db 1193 ARRFLWNCALSIVKEGRSWLTSHSMEECEALCTRMAIMVNGRFRCLGSVQHLKNRFGDG 1252 

Qy 2281 YMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVS 2339 

| | || |: : | I II II : : I I I : I : I I I I I III:: I I : I 
Db 1253 YTIWRIAGSNPDLKPVQEFFGLAFPGSVLKEKHRNMLQYQLPSSLSSLARIFSILSQSK 1312 

Qy 2340 GVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

I I I I I I I I I I I I I I I I I I I I III: 
Db 1313 KRLHIEDYSVSQTTLDQVFVNFAKDQSDD 1341 

RESULT 5 

US-08-762-500-26 

Sequence 26, Application US/08762500 
Patent No. 6030806 
GENERAL INFORMATION: 

APPLICANT: Landes, Gregory M. 
APPLICANT: Burn, Timothy C. 
APPLICANT: Connors, Timothy D. 
APPLICANT: Dackowski, William R. 
APPLICANT: Van Raay, Terence J. 
APPLICANT: Klinger, Katherine W. 

TITLE OF INVENTION: NOVEL HUMAN CHROMOSOME 16 GENES, 

TITLE OF INVENTION: COMPOSITIONS, METHODS OF MAKING AND USING SAME 
NUMBER OF SEQUENCES: 83 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: GENZYME CORPORATION 
STREET: One Mountain Road 
CITY: Framingham 
STATE: Massachusetts 
COUNTRY: United States of America 
ZIP: 01701 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/762,500
FILING DATE: 09-DEC-1996
CLASSIFICATION: 435
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/665,259
FILING DATE: 17-JUN-1996
PRIOR APPLICATION DATA:

APPLICATION NUMBER: PCT/US96/10469
FILING DATE: 17-JUN-1996
ATTORNEY/AGENT INFORMATION:
NAME: Dugan, Deborah A.
REGISTRATION NUMBER: 37,315
REFERENCE/DOCKET NUMBER: IG5-9.3
TELECOMMUNICATION INFORMATION: 



TELEPHONE: (508) 872-8400 
TELEFAX: (508) 872-5415 
INFORMATION FOR SEQ ID NO: 26: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1375 amino acids 
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: unknown 
MOLECULE TYPE: protein 
US-08-762-500-26 

Query Match 25.1%; Score 3173.5; DB 3; Length 1375; 

Best Local Similarity 46.5%; Pred. No. 5.4e-274; 

Matches 683; Conservative 205; Mismatches 372; Indels 209; Gaps 30; 

Qy 980 MEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILT 1039 

I | || I I I I I I : I I I I : I I : I :: I : I I I I I : I I I I I I I I I I I I 

Db 2 MEEEPTHLRLGVSIQNLVKVTRDGMKVAVDGLALNFYEGQITSFLGHNGAGKTTTMSILT 61

Qy 1040 GLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEE 1099

I | | | | | | | : I I I llhll I I : I I I : I I I I I I I I I I I I I I I : I I I : I I I : : : = 

Db 62 GLFPPTSGTAYILGKDIRSEMSSIRQNLGVCPQHNVLFDMLTVEEHIWFYARLKGLSEKH 121 

Qy 1100 IRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYA 1158 

: : | | : : I I : I : I I I I I I I : I I I I I I : I I I I I I :: I I I I I I I I I I I I : 

Db 122 VKAEMEQMALDVGLP P S KLKS KT SQLS GGMQRKLS VALAFVGGS KVVI LDEPTAGVDP YS 181 

Qy 1159 RRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYR 1218 

II | I : I : I I I : I I I I : I I I I I I I I I I : I I I I I I I I I I I I I I II I I I I I I I 

Db 182 RRGIWELLLKYRQGRTIILSTHHMDEADILGDRIAIISHGKLCCVGSSLFLKNQLGTGYY 241 

Qy 1219 LTLVKRPAEPG GPQEPGLASSPPGRAPLSSCSELQVSQ 1256 

I I I I I : I -HI I : I 

Db 242 LTLVKKDVESSLSSCRNSSSTVSCLKKEDSVSQSSSDAGLGSDHESDTLTIDVS AISN 299 

Qy 1257 FIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLE 1316 

Mill: Ml I I : I : I I I I I I : I I I II : : I I : I I : I : : I I I I 
Db 300 LI RKHVS EARLVEDI GHELTYVLPYEAAKEGAFVELFHEI DDRLSDLGI SSYGI SETTLE 359 

Qy 1317 EVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSA 1376 

1:1 llhll I I I: I I 
Db 360 EIFLKVAEE SGVDA-ETSDGTLP 381 

Qy 1377 SSVGSARGDEGAGYTDVYGDYRPLF DNPQDPD — NVSLQEVEAEALSRV-GQGSR 1428 

| | : | : I I : I : I I : : : : I : I I : I : I I 

Db 382 ARRNRRA FGDKQSCLHPFTEDDAVDPNDSDIDPESRETDLLSGMDGKGSY 431

Qy 1429 KLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPL 1488 

: I I | : I I MM II I : I 1*11*111 111*1- : I I I I I I 
Db 432 QLKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALVFSLIVPPFGKYPSL 491 

Qy 1489 VLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKS 1547 

|||:|| : : : I | | : I :: MM: 

Db 492 ELQPWMYNEQYT FVSNDAPE DMGTQELLNALTKDPGFGTRCMEGN 536

Qy 1548 PANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDE 1607 

| I : II : : I 



Db 537 PIP DTPCL AGEE 548



1608 D LQ AWNVS L P P TAG P EM WTSAPSLPRLVREPVRCTCSAQGTGFS CPSSVGG- 1658 

| | :| | : :: II I III: II I I 

549 D WTISPVPQSIVDLFQNGNWTMKNPSP -- ACQCSSDKIKKMLPVCPPGAGGL 598

1659 HPPQMRWTGDILTDITGHNVSEYLLFTSDRF RLHRYGAITFG 1701 

Ml: I I I I : : I I I : I : I I : I : I I I • I 

599 PPPQRKQKTADILQNLTGRNISDYLVKTYVQIIAKSLKNKIWVNEFRYGGFSLGVSNSQA 658

1702 NVLKS I PAS FGTRAP PMVRKI A VRRAAQ VFYNNKG YH SMPT 1742 

: | : : | | : : : : | : : | | | | : I : : : 

659 LPPSHEVNDAIKQMKKLLKLTKDTSADRFLSSLGRFMAGLDTKNNVKVWFNNKGWHAISS 718 

1743 YLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLS-LDYLLQGTDWIAIFIIV 1801

: I I : I I I II I I I I I : I I : I I II I I I : I I I I : : ||:::| :| 

719 FLNVINNAILRANLQKGE-NPSQYGITAFNHPLNLTKQQLSEVALMTTSVDVLVSICVIF 777 

1802 AMS FVPAS FWFLVAEKSTKAKHLQFVSGCNPI I YWLAN YVWDMLNYLVPATCCVI I LFV 1861 
I | | | | | I I I I I I I : I : : I I I I I I I : I I I : I I I I : I : I I I I I I : I I I I : I I 
778 AMS FVPAS FWFLIQERVSKAKHLQFI SGVKPVI YWLSNFVWDMCN YWPATLVI 1 1 FI C 837 

1862 FDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATV 1921 

I : I I I I I : I I I I I I I I I I : I I I I I I I I I : I I I I : I I I I I I = I 

838 FQQKSYVSSTNLPVLALLLLLYGWSITPLMYPAS FVFKIPSTAYWLTSVNLFIGINGSV 897 

1922 ATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMK 1981 

I I I : I : I I : : I I : I I I I I I I I I :: I I I I :: I I : : : : I : :: 

898 ATFVLELFTNNK-LNDINDILKSVFLIFPHFCLGRGLIDMVKNQAMADALERFGE-NRFV 955 

1982 SPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVED-DVDVASERQR 2040 

II I I : I I I I I I I I I I I I I : I : : I I I I I : : I : I I I I MM 

956 SPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRPVKAKLPPLNDEDEDVRRERQR 1015 

2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 

: | | | | : : : I : | | I : I : : I I I I I : I : I : II I I M I II M I I II :: II I I 
1016 I LDGGGQNDI LEI KELTKI YRRK RKPAVDRICIGIPPGECFGLLGWGAGKSTTFKM 1072 

2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

| | M I I : I I : I : I : I : : I I : : I I I I I I I : : I I I I I : : : MM 
1073 LTGDT P VT RGDAFLN KN S I LSN I HEVHQNMG YC PQ FDAI T ELLTGREHVE FFALLRGVP E 1132 

2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 

| : : : I I : I I I I I : I I I I I II II M I I I : I I I I I : I II I I I I I I I I I 
1133 KEVGK FGEWAI RKLGLVK YGEKYASN YS GGNKRKL S T AMAL I GG P P WFLDE PTT GMD P K 1192 

2221 ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 

M II I I I I : : I I I II I I I II I I II II I II M : M I I I I I I I II I : I I I I I I II I I 

1193 ARRFLWNCALSIVKEGRSWLTSHSMEECEALCTRMAIMVNGRFRCLGSVQHLKNRFGDG 1252 

2281 YMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVS 2339 

I I II I: : I I II II ::MI:I Mill I I I I : : I I : I 
1253 YTIVVRIAGSNPDLKPVQEFFGLAFPGSVLKEKHRNMLQYQLPSSLSSLARI FSILSQSK 1312 

2340 GVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

I II I I II I I II M II I II II Ml: 

1313 KRLHIEDYSVSQTTLDQVFVNFAKDQSDD 1341 



RESULT 6 

US-08-665-259-25 

Sequence 25, Application US/08665259 
Patent No. 6028173 
GENERAL INFORMATION: 

APPLICANT: Landes, Gregory M. 
APPLICANT: Burn, Timothy C. 
APPLICANT: Connors, Timothy D. 
APPLICANT: Dackowski, William R. 
APPLICANT: Van Raay, Terence J. 
APPLICANT: Klinger, Katherine W. 

TITLE OF INVENTION: NOVEL HUMAN CHROMOSOME 16 GENES, 

TITLE OF INVENTION: COMPOSITIONS, METHODS OF MAKING AND USING SAME 
NUMBER OF SEQUENCES: 73 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: GENZYME CORPORATION 
STREET: One Mountain Road 
CITY: Framingham 
STATE: Massachusetts 
COUNTRY: United States of America 
ZIP: 01701 
COMPUTER READABLE FORM:

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/665,259 
FILING DATE: 17-JUN-1996 
CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 
NAME: Dugan, Deborah A. 
REGISTRATION NUMBER: 37,315 
REFERENCE/DOCKET NUMBER: IG5-9.1
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (508) 872-8400 
TELEFAX: (508) 872-5415 
INFORMATION FOR SEQ ID NO: 25: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1684 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-665-259-25 

Query Match 20.7%; Score 2622; DB 3; Length 1684; 

Best Local Similarity 34.0%; Pred. No. 2.1e-224; 

Matches 638; Conservative 317; Mismatches 556; Indels 364; Gaps 45; 

Qy 581 KGFPDEESIVNYTLNQAYQDNV — TVFASVI FQ TRKDGSLPPHVHYKIR 627 

: | | I I : : I I I : I I : I : I : II I I : I 

Db 88 RGFPSEKDFEDY IRYDNCSSSVLAAVVFEHPFNHSKEPLPLAVKYHLRFSYTRRNY 143 

Qy 628 QNSSFTEK TNEIRRAYWRPG PNTGGRFYFLYGFVWIQDMMERAI I 672 

| M I I : : I I I : I I I I : : I : : I I I : 



Db 144 MWTQTGSFFLKETEGWHTTSLFPLFPNPGPRELTSPDGGEPGYIREGFLAVQHAVDRAIM 203 

Qy 673 DTFVGHDWEPGSY VQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQH 726 

: I : : : | | | | : Ml I : : : I I : : : I : I : : 

Db 204 EYHA — DAATRQLFQRLTVTIKRFPYPPFIADPFLVAIQYQLPLLLLLSFTYTALTIARA 261 

Qy 727 IVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLS I SVT ALTAI L KYGQVLMHS 781 

: I I I I I I I I I : I I I : : : I I I I I : I : I I : : : I : II I 

Db 262 VVQEKERRLKEYMRMMGLSSWLHWSAWFLLFFLFLLIAASFMTLLFCVKVKPNVAVLSRS 321 

Qy 782 HWI I WL FLAVYAVAT I MFC FLVS VL Y S KAKLAS AC G GI I Y FL S YVP YMYVAI RE EVAH D 841 

: : || : I : : I I I I : I I : I I I : I : I I I : I I : I : I I : I I I : : 
Db 322 DPSLVLAFLLCFAI STISFSFMVSTFFSKANMAAAFGGFLYFFTYIPYFFVAPR YN 377 

Qy 842 KITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVE-GDDFNLLLAVTML 900 

: I : I : I : I I : I : : : I I : I I I I III III : I I 

Db 378 WMTLSQKLCSCLLSNVAMAMGAQLIGKFEAKGMGIQWRDL-LSPVNVDDDFCFGQVLGML 436 

Qy 901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960

: : I : I : I I : : I I I : I I I I I : I : I : I I I I : I I I I I I : 

Db 437 LLDSVLYGLVTWYMEAVFPGQFGVPQPWYFFIMPSYWCGKPRAVAGK 483 

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYK — DDKKLALNKLSLNLYEN 1018 

||:: : : I I III I : : I : I I : : : : I : I : I I I I I 

Db 484 -EEEDSDPEKALRNEY FEAE P E DLVAG I KI KH L S KVFRVGN KD RAAVRDLN LN L YEG 539 

Qy 1019 QWSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFD 1078 

I : I I I I I I I I I I I : I : I I I I I I I I I I I I I : : I : I : H I : I I : I I I I : : I I I 
Db 540 QITVLLGHNGAGKTTTLSMLTGLFPPTSGRAYISGYEISQDMVQIRKSLGLCPQHDILFD 599 

Qy 1079 RLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAF 1138 

I I I I I I : I I : : I I : : : : : I : : I : : I : I : I : I I I I I : I I I I : I I 
Db 600 NLTVAEHLYFYAQLKGLSRQKCPEEVKQMLHIIGLEDKWNSRSRFLSGGMRRKLSIGIAL 659 

Qy 1139 VGGSRAI I LDEPTAGVDP YARRAIWDLI LKYKPGRT I LLSTHHMDEADLLGDRI AI I SHG 1198 

: ||: : I I I I I I : I : I : I I I II I I : : I I I I : I : I I I I I I I I I I I I I I I : : I 
Db 660 IAGSKVLILDEPTSGMDAISRRAIWDLLQRQKSDRTIVLTTHFMDEADLLGDRIAIMAKG 719 

Qy 1199 KLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFI 1258 

: I : I I I I I I I I I I I I : I I II I I = : M : 

Db 720 ELQCCGSSLFLKQKYGAGYHMTLVKEP -HCNPEDISQLV 757 

Qy 1259 RKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEV 1318 

||: II I I I : I I I I : : I I I I 11= I : : I I I I I : I I I 
Db 758 HHHVPNATLESSAGAELSFILPRESTHR— FEGLFAKLEKKQKELGIASFGASITTMEEV 815 

Qy 1319 FLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASS 1378 

I I : I : | : | | :: : I I : I I : I II 
Db 816 FLRVGK LVDSSMDIQAIQ LPALQ--YQHERRASDWAVDSNL— 854 

Qy 1379 VGSARGDEGAGYTDVYGD YRPLFDNPQDPDNVS LQEVEAEALS RVGQGS RKLDGGW- LKV 1437 

I : : I I : I I I I : I I : I I 

Db 855 CGAMDPSDGIG ALIEEERTAV KLNTGLALHC 885 

Qy 1438 RQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHN 1497

: I I : : I : : I I : : I : I : I I : I : I : I I I I : : I 

Db 886 QQFWAMFLKKAAYSWREWKMVAAQVLVPLTCWIJ^IAINYSSELFDDPMLRLTLGEY-- 943



1498 YTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCV-LKSPANGSLGPT 1556

I I I I I I 

944 GRTWPFSVPGTSQLGQQ 961 

1557 LNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSL 1616 

962 LS 963 

1617 PPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITG 1676

I :: 1:1 : I I II: 
964 EHLKDALQAEG QEPREVLGDL — 984 

1677 HNVS EYLLFT S DRFRLHRYGAIT FG — NVLKS I PAS FGTRAP PMVRKI AVRRAAQVFYNN 1734 
| : | : | : : : I I : I I I I : I -II 

985 EEFLIFRA- SVEGGGFNERCLVAASF RDVGERTWNALFNN 1024 

1735 KGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQG-TDV 1793 

: I I I I I : : I : I | | | | : | | : : : : I : I 

1025 QAYH S P AT ALAWDN L L F KLLCGPHA-SIWSNFPQPRSALQAAKDQFNEGRKGF 1078 

1794 VIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPAT 1853 

||: : : | I : I : : : I : I : I : : : I I I : I I I I I = : I h : I I : : : : I : I : 
1079 DIALNLLFAI^FLASTFSILAVSERAVQAKHVQFVSGVHVASFWLSALLWDLISFLIPSL 1138 

1854 CCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINL 1913 

: : : I I : I : I : I I I I I I I : I I : I I : I : I : : M I = h 

1139 LLL WFKAFD VRAFT RDGHMADT LLLLLL YGWAI I P LMYLMN F F FLGAAT AYTRLT I FN I 1198 

1914 FIGITATVATFLLQLFEHDKDLKV — VNSYLKSCFLI FPNYNLGHGLMEMAYNEY 1966 

|| MM: : I : : : I I I : M : M : II 

1199 LSGI --ATFLMVTIMRIPAVKLEELSKTLDHVFLVLPNHCLGMAVSSF-YENYETRRY 1253 

1967 INEYYAKIGQFDKMKSPFEWDI — VTRGLVAMAVEGWGFLLTIMCQYNFLRRPQ 2019 

: : | | , : : : I I I : : II I : I : : I I : I : 

1254 CTS SEVAAHYCKKYNIQYQENFYAWSAPGVGRFVASMAASGCAYLI LLFLI ETNLLQRLR 1313 

2020 RMPVSTKPVEDDVDVASERQRVLRGDADNDM VKIENLTKV 2059 

I II I : : I I I I I I I : I I : : : I : I : I I 

1314 GILCALRRRRTLTELYTRMPV LPEDQDVADERTRILAPSPDSLLHTPLIIKELSKV 1369 

2060 YKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSV 2119 

| : | : | | || || I I : II I I I I I I I I I II I : I I I I I I I : I I I hill I I ' 
1370 YEQRV--PLLAVDRLSLAVQKGECFGLLGFNGAGKTTTFKMLTGEESLTSGDAFVGGHRI 1427

2120 LKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKY 2179 

::: I : I : II I I I I I I I : I I I I : I I I I I I : h I I I • 

1428 S S DVG KVRQRI G YC P Q FDALL D HMT G REMLVMYARLRG I P E RH I GACVENT L RGLL LE P H 1487 

2180 ADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSV 2239 

|:| | | I I I I I I II I I I I II I II lllllhlllll IN II: : ::|::: 
1488 ANKLVRTYSGGNKRKLSTGIALIGEPAVIFLDEPSTGMDPVARRLLWDTVARARESGKAI 1547 

2240 VLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKS SQSVKDV 2296 

:: | | | | | | I I I I I I I I I I I I I h :IMI lllhrll II : : : I 
1548 IITSHSMEECEALCTRLAIMVQGQFKCLGSPQHLKSKFGSGYSLRAKVQSEGQQEALEEF 1607 



Qy 2297 VRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDN 2356 

I : I I : : I : : | III : I I : i I : I : I : : I I I I I I : I : 

Db 1608 KAFVDLTFPGSVLEDEHQGMVHYHLPGRDLSWAKVFGILEKAKEKYGVDDYSVSQISLEQ 1667 

Qy 2357 VFVNFAKKQSDNLEQ 2371 

I I :: I I I I : 
Db 1668 VFLSFAHLQPPTAEE 1682 



RESULT 7 

US-08-762-500-25 

Sequence 25, Application US/08762500 
Patent No. 6030806 
GENERAL INFORMATION: 

APPLICANT: Landes, Gregory M. 
APPLICANT: Burn, Timothy C. 
APPLICANT: Connors, Timothy D. 
APPLICANT: Dackowski, William R. 
APPLICANT: Van Raay, Terence J. 
APPLICANT: Klinger, Katherine W. 

TITLE OF INVENTION: NOVEL HUMAN CHROMOSOME 16 GENES, 

TITLE OF INVENTION: COMPOSITIONS, METHODS OF MAKING AND USING SAME 
NUMBER OF SEQUENCES: 83 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: GENZYME CORPORATION 
STREET: One Mountain Road 
CITY: Framingham 
STATE: Massachusetts 
COUNTRY: United States of America 
ZIP: 01701 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/762,500 
FILING DATE: 09-DEC-1996 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/665,259 
FILING DATE: 17-JUN-1996 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: PCT/US96/10469
FILING DATE: 17-JUN-1996 
ATTORNEY/AGENT INFORMATION:
NAME: Dugan, Deborah A.
REGISTRATION NUMBER: 37,315
REFERENCE/DOCKET NUMBER: IG5-9.3
TELECOMMUNICATION INFORMATION:
TELEPHONE: (508) 872-8400
TELEFAX: (508) 872-5415
INFORMATION FOR SEQ ID NO: 25: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1684 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 



MOLECULE TYPE: protein
US-08-762-500-25 



Query Match 20.7%; Score 2622; DB 3; Length 1684; 

Best Local Similarity 34.0%; Pred. No. 2.1e-224; 

Matches 638; Conservative 317; Mismatches 556; Indels 364; Gaps 45; 

Qy 581 KGFPDEESIVNYTLNQAYQDNV— TVFASVIFQ TRKDGSLPPHVHYKIR 627 

: I I I I : :| II :| |:|:|: I I I I : J 

Db 88 RGFPSEKDFEDY IRYDNCSSSVLAAWFEHPFNHSKEPLPLAVKYHLRFSYTRRNY 143 

Qy 628 QNSSFTEK TNEI RRAYWRPG PNTGGRFYFLYGFVWIQDMMERAI I 672 

| | | I I : : I I I : I I | I : : I :: I I I : 

Db 144 MWTQTGSFFLKETEGWHTTSLFPLFPNPGPRELTSPDGGEPGYIREGFLAVQHAVDRAIM 203 

Qy 673 DTFVGHDWEPGSY VQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQH 726 

: | : : : | I I I : I I I I :: : I I : :: I : I : . ' 

Db 204 EYHA--DAATRQLFQRLTVTIKRFPYPPFIADPFLVAIQYQLPLLLLLSFTYTALTIARA 261

Qy 727 IVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAIL KYGQVLMHS 781

: I I I I I I I I I : I I I : : : I I I I I : I : I I : : : I : II I 

Db 262 WQEKERRLKEYMRMMGLSSWLHWSAWFLLFFLFLLIAAS FMTLLFCVKVKPNVAVLSRS 321

Qy 782 HWIIWLFLAWAVATIMFCFLVSVT.YSKAKLASACGGIIYFLSYVPYMYVAIREEVAHD 841 

: : || : | : : M | I : I I : I I I M M M M I M M I M I I : : 
Db 322 DPSLVLAFLLCFAISTISFSFMVSTFFSKANMAAAFGGFLYFFTYIPYFFVAPR YN 377 

Qy 842 KITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVE-GDDFNLLLAVTML 900 

: | : | : I : I I : I : : : I I : I I I I Ml III : I I 

Db 378 VMTLSQKLCSCLLSNVAMAMGAQLIGKFEAKGMGIQWRDL-LSPVWDDDFCFGQVLGML 436 

Qy 901 MVTlAWYGILTWYIEAVTiPGMYGLPRPWYFPLQKSYW^ 960 

: : I : I : I I : : I I I : I I I I I : I : I : I I I I : I I I I I I : 
Db 437 LLDSVLYGLVTWYMEAVFPGQFGVPQPWYFFIMPSYWCGKPRAVAGK 483 

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYK — DDKKLALNKLSLNLYEN 1018 

||:: : : I I III I : : I M I : : : : I : 1 = 11111 

Db 484 -EEEDSDPEKALRNEY FEAEPEDLVAGIKIKHLSKVFRVGNKDRAAVRDLNLNLYEG 539 

Qy 1019 QWSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFD 1078 

I : I I I I I I I I I I I : I : I I I I I I I I I I I I I I : I : I I I : I I : I I I I : : I I I 
Db 540 QITVLLGHNGAGKTTTLSMLTGLFPPTSGRAYISGYEISQDMVQIRKSLGLCPQHDILFD 599 

Qy 1079 RLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAF 1138 

Ml 111:11 :: II I : : I : : I M :l : I I I I I : I I I I : II 

Db 600 NLTVAEHLYFYAQLKGLSRQKCPEEVKQMLHIIGLEDKWNSRSRFLSGGMRRKLSIGIAL 659 

Qy 1139 VGGSRAI I LDEPTAGVDP YARRAIWDLI LKYKPGRT I LLSTHHMDEADLLGDRIAI I SHG 1198 

: ||: : I I I I I I : I : I MINIM: : | | | | : I : I I I I I I I I I I I I I I I : : I 
Db 660 IAGSKVXILDEPTSGMDAISRRAIWDLLQRQKSDRTIVLTTHFMDEADLLGDRIAIMAKG 719 

Qy 1199 KLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFI 1258 

M M I II I I I I I I I I : I I I I I I : : I I : 

Db 720 ELQCCGSSLFLKQKYGAGYHMTLVKEP HCNPEDISQLV 757 

Qy 1259 RKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEV 1318 

||: M I II Ml I I : : I I I I I I : I : M I I I I M I I 



Db 758 HHHVPNATLESSAGAELSFILPRESTHR--FEGLFAKLEKKQKELGIASFGASITTMEEV 815

Qy 1319 FLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASS 1378
||:| : | :| |:: : || : I I : I I I
Db 816 FLRVGK LVD S SMD I QAI Q- - - L P ALQ - - YQH E RRAS DW AVD S N L 854

Qy 1379 VGSARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGW-LKV 1437
I : : I I : | | | | : I I : I I
Db 855 CGAMDPSDGIG ALIEEERTAV KLNTGLALHC 885

Qy 1438 RQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHN 1497
: I I : : I : : I I : : I : I : I I : I : I : I I I I : : I
Db 886 QQFWAMFLKKAAYSWREWKMVAAQVLVPLTCVTLALLAINYSSELFDDPMLRLTLGEY 943

Qy 1498 YTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCV-LKSPANGSLGPT 1556
III I II
Db 944 GRTWP FS VPGT S QLGQQ 961

Qy 1557 LNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSL 1616
I :
Db 962 LS 963

Qy 1617 PPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITG 1676
I : : I : I Mill:
Db 964 EHLKDALQAEG QEPREVLGDL 984

Qy 1677 HNVSEYLLFTSDRFRLHRYGAITFG--NVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNN 1734
I : I : I : :: I I : I I I I : I : I I
Db 985 EEFLIFRA SVEGGGFNERCLVAASF RDVGERTWNALFNN 1024

Qy 1735 KGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQG-TDV 1793
: I I I II : : I : I II I hi I : : : = I -I
Db 1025 QAYH S P AT ALAWDN L L F KLLCGPHA-SIVVSNFPQPRSALQAAKDQFNEGRKGF 1078

Qy 1794 VIAIFIIVAMSFVPAS FWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPAT 1853
||: : : I I : I : : : I : I : I : : : I I I : I I I I I : : I I : : I I : : : : I : I :
Db 1079 DIALNLLFAMAFLASTFSILAVSERAVQAKHVQFVSGVHVASFWLSALLWDLISFLIPSL 1138

Qy 1854 CCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINL 1913
: : : I I : I : I : I I I I I I I : I I : I I : I : I : : I I I = h
Db 1139 LLLWFKAFDVRAFTRDGHMADTLLLLLLYGWAIIPLMYLMNFFFLGAATAYTRLTIFNI 1198

Qy 1914 FIGITATVATFLLQLFEHDKDLKV--VNSYLKSCFLIFPNYNLGHGLMEMAYNEY 1966
II INI: : I : : : I I I : I I : I I • I I
Db 1199 LSGI ATFLMVTIMRIPAVKLEELSKTLDHVFLVLPNHCLGMAVSSF-YENYETRRY 1253

Qy 1967 INEYYAKIGQFDKMKSPFEWDI--VTRGLVAMAVEGWGFLLTIMCQYNFLRRPQ 2019
: : I I :: : I I I : :|| I .:| : : I hi :
Db 1254 CTSSEVAAHYCKKYNIQYQENFYAWSAPGVGRFVASMAASGCAYLILLFLIETNLLQRLR 1313

Qy 2020 RMPVSTKPVEDDVDVASERQRVLRGDADNDM VKIENLTKV 2059
I I I I : : I I I I I I I : I I : : : I : I : I I
Db 1314 GILCALRRRRTLTELYTRMPV LPEDQDVADERTRILAPSPDSLLHTPLIIKELSKV 1369

Qy 2060 YKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSV 2119
| : I : I I I I I I I I : I I I II II I I I I I I I : I I I I I I I : I I I hill II :
Db 1370 YEQRV--PLLAVDRLSLAVQKGECFGLLGFNGAGKTTTFKMLTGEESLTSGDAFVGGHRI 1427



Qy 2120 LKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKY 2179 

:: : I : I : I I I I I I I I I : I I I I : I I I I I I : I : I II : 

Db 1428 S S DVGKVRQ RIGYCPQF DAL L DHMT GREMLVMYARL RG I P E RH I GAC VEN TLRGLLLEPH 1487 

Qy 2180 ADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSV 2239 

1:1 I I I I I I I I I I I I I I I I I II I I I I I I : I I I I I III II: : : : I : : : 
Db 1488 ANKLVRTYSGGNKRKLSTGIALIGEPAVIFLDEPSTGMDPVARRLLWDTVARARESGKAI 1547

Qy 2240 VLTSHSMEECEALCTRIAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKS SQSVKDV 2296 

:: I I I I I I I I I I I I I I I I I I I I :: I I I I I I I I :: I I I I : : : I : : : 
Db 1548 IITSHSMEECEALCTRLAIMVQGQFKCLGSPQHLKSKFGSGYSLRAKVQSEGQQEALEEF 1607 

Qy 2297 VRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDN 2356

I : | | ■ : : | : : I III : I I : I I : I : I : : I I I I I I : I : 

Db 1608 KAFVDLTFPGSVLEDEHQGMVHYHLPGRDLSWAKVFGILEKAKEKYGVDDYSVSQISLEQ 1667 

Qy 2357 VFVNFAKKQSDNLEQ 2371 

I I :: I I I I : 

Db 1668 VFLSFAHLQPPTAEE 1682 



RESULT 8 

US-08-762-500-75 

Sequence 75, Application US/08762500 
Patent No. 6030806 
GENERAL INFORMATION: 

APPLICANT: Landes, Gregory M. 
APPLICANT: Burn, Timothy C. 
APPLICANT: Connors, Timothy D. 
APPLICANT: Dackowski, William R. 
APPLICANT: Van Raay, Terence J. 
APPLICANT: Klinger, Katherine W. 

TITLE OF INVENTION: NOVEL HUMAN CHROMOSOME 16 GENES, 

TITLE OF INVENTION: COMPOSITIONS, METHODS OF MAKING AND USING SAME 
NUMBER OF SEQUENCES: 83 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: GENZYME CORPORATION 
STREET: One Mountain Road 
CITY: Framingham 
STATE: Massachusetts 
COUNTRY: United States of America 
ZIP: 01701 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/762,500
FILING DATE: 09-DEC-1996
CLASSIFICATION: 435
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/665,259
FILING DATE: 17-JUN-1996
PRIOR APPLICATION DATA:

APPLICATION NUMBER: PCT/US96/10469
FILING DATE: 17-JUN-1996
ATTORNEY/AGENT INFORMATION:
NAME: Dugan, Deborah A.
REGISTRATION NUMBER: 37,315
REFERENCE/DOCKET NUMBER: IG5-9.3
TELECOMMUNICATION INFORMATION:
TELEPHONE: (508) 872-8400
TELEFAX: (508) 872-5415
INFORMATION FOR SEQ ID NO: 75:
SEQUENCE CHARACTERISTICS:
LENGTH: 1704 amino acids
TYPE: amino acid
TOPOLOGY: linear
MOLECULE TYPE: protein
US-08-762-500-75 



Query Match 20.7%; Score 2622; DB 3; Length 1704; 

Best Local Similarity 34.0%; Pred. No. 2.2e-224; 

Matches 638; Conservative 317; Mismatches 556; Indels 364; Gaps 45; 

Qy 581 KGFPDEESIVNYTLNQAYQDNV— TVFASVIFQ TRKDGSLPPHVHYKIR 627 

: I M I : : I II : I I : I : I : II I I : I 

Db 108 RGFPSEKDFEDY- 1 RYDNCS S SVLAAVVFEHPFNHSKEPLPLAVKYHLRFS YTRRNY 163 

Qy 628 QNSSFTEK TNEI RRAYWRPG PNTGGRFYFLYGFVWIQDMMERAI I 672 

| | | | I : : I I I : I I I I : : I :: I I I : 

Db 164 MWTQTGSFFLKETEGWHTTSLFPLFPNPGPRELTSPDGGEPGYIREGFLAVQHAVDRAIM 223 

Qy 673 DTFVGHDWEPGSY VQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQH 726

: | : : : I I I I : I I I I : : : I I : : : I : I : s 

Db 224 EYHA— DAATRQLFQRLTVTIKRFPYPPFIADPFLVAIQYQLPLLLLLSFTYTALTIARA 281 

Qy 727 IVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAIL KYGQVLMHS 781

:| III INI I: III:: : II III: |: I I: : :| : II I 

Db 282 WQEKERRLKEYMRMMGLSSWLHWSAWFLLFFLFLLIAAS FMTLLFCVKVKPNVAVLSRS 341 

Qy 782 HWIIWLFIAWAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHD 841 

: : || : I : : I I I I : I I : I I I : I : I I I : I I : I : I I : I I I ' 
Db 342 DPSLVLAFLLCFAISTISFSFMVSTFFSKANMAAAFGGFLYFFTYIPYFFVAPR YN 397 

Qy 842 KITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVE-GDDFNLLLAVTML 900 

: I : I : I : I I : I : : :| 1 = 1111 III III : II 

Db 398 WMTLSQKLCSCLLSNVAMAMGAQLIGKFEAKGMGIQWRDL-LSPVNVDDDFCFGQVLGML 456 

Qy 901 MVT)AWYGILTWYIEAVliPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWAR 960 

: : I : I : I I : : I I I : I I I I I : I : I : I I II : III I I I = 

Db 457 LLDSVLYGLVTWYMEAVFPGQFGVPQPWYFFIMPSYWCGKPRAVAGK 503

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYK--DDKKLALNKLSLNLYEN 1018 

||:: : : I I I II I : : I : I I : : : : I : I : I I I I I . 

Db 504 -EEEDSDPEKALRNEY FEAEPEDLVAGIKIKHLSKVFRVGNKDRAAVRDLNLNLYEG 559 

Qy 1019 QWS FLGHNGAGKTTTMS I LTGLFP PTS GSAT I YGHDI RTEMDEI RKNLGMCPQHNVLFD 1078 

I: t I I II I I I I I I: 1:1 I I I I I I I I I I I |::| : I : I I I : I I : I I I I : : I I I 
Db 560 QITVLLGHNGAGKTTTLSMLTGLFPPTSGRAYISGYEISQDMVQIRKSLGLCPQHDILFD 619 

Qy 1079 RLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAF 1138 



I I I I I I : I I : : I I : : : : : I : : I : : I : I : I : I I I I I : I I I I : I I 
620 NLTVAEHLYFYAQLKGLSRQKCPEEVKQMLHIIGLEDKWNSRSRFLSGGMRRKLSIGIAL 679 

1139 VGGSRAI I LDEPTAGVDPYARRAIWDLI LKYKPGRTI LLSTHHMDEADLLGDRIAI I SHG 1198 

: ||: : I I I I I I : I : I : I I I I I I I : : I I I I : I : I I 1111111111111 = : I 
680 IAGSKVLILDEPTSGMDAISRRAIWDLLQRQKSDRTIVLTTHFMDEADLLGDRIAIMAKG 739 

1199 KLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFI 1258 

: I : I I I I I I I I I I I I : I I I I I I : Ml : 

740 ELQCCGSSLFLKQKYGAGYHMTLVKEP HCNPEDISQLV 777 

1259 RKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEV 1318 

|| : || Mhllll: : MM I I : I —Ml IIMII 

778 HHHVPNATLESSAGAELSFILPRESTHR— FEGLFAKLEKKQKELGIASFGASITTMEEV 835 

1319 FLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASS 1378

I I : I : I : I I :: : II : I I : I I I 
836 FLRVGK LVDSSMDIQAIQ LPALQ— YQHERRASDWAVDSNL 874 

1379 VGSARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGW-LKV 1437 
| : : | | Mill: I I : I I 

875 CGAMDPSDGIG ALIEEERTAV KLNTGLALHC 905 

1438 RQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHN 1497

: | | : : I : : I I : : I M : I I : I : I : I I I I : ' I 

906 QQFWAMFLKKAAYSWREWKMVAAQVLVPLTCVTLALIAINYSSELFDDPM — 963 

1498 YTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCV-LKSPANGSLGPT 1556 

III I II 

964 GRTWPFSVPGTSQLGQQ 981 



1557 LNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSL 1616 
I: 

982 LS 



983 



1617 PPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITG 1676 

I :: I :| : I I I I : 
994 EH LKDALQAEG QEPREVLGDL 1004 

1677 HNVSEYLLFT S DRFRLHRYGAI T FG — NVLKS I PAS FGTRAP PMVRKI AVRRAAQVFYNN 1734 

I : I : I : : : I I : I I I I : I Ml 

1005 EEFLIFRA SVEGGGFNERCLVAASF RDVGERTWNALFNN 1044 

1735 KGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQG-TDV 1793

: I I I || : : | : I II I I M I : : : : I M 

1045 QAYHSPATALAWDNLLF KLLCGPHA-SIWSNFPQPRSALQAAKDQFNEGRKGF 1098 

1794 VIAIFI IVAMSFVPAS EVVFLVAEKSTKAKHLQFVSGCNPII YWLANYVWDMLNYLVPAT 1853

I I : : : I I : I : :M : I M : : M I I M M I I = MM : I I :: : M M : 
1099 D I ALN L L FAMAFLAS T FS I LAVS ERAVQAKHVQ FVS GVHVAS FWL S ALLWDL I S FL I P S L 1158 

1854 CCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINL 1913 

: : : I I : I M : I I I I I I I M I M I MM : M I I : I : 
1159 LLLWFKAFDVRAFTRDGHMADTLLLLLLYGWAIIPLMYLMNFFFLGAATAYTRLTIFNI 1218 

1914 FIGITATVATFLLQLFEHDKDLKV— VNSYLKSCFLIFPNYNLGHGLMEMAYNEY 1966 

II MM: : | : : : I II : I I : I I : I I 



Db 1219 LSGI ATFLMVTIMRIPAVKLEELSKTLDHVFLVLPNHCLGMAVSSF-YENYETRRY 1273



Qy 1967 INEYYAKIGQFDKMKSPFEWDI--VTRGLVAMAVEGWGFLLTIMCQYNFLRRPQ 2019 

: : I I : : : I I I : : I I I : | : : I I : I : 

Db 1274 CTSSEVAAHYCKKYNIQYQENFYAWSAPGVGRFVASMAASGCAYLILLFLIETNLLQRLR 1333 

Qy 2020 RMPVSTKPVEDDVDVASERQRVLRGDADNDM--VKIENLTKV 2059

I I I I : : I I I I I I I : I I : : : I : I*: I I 

Db 1334 GILCALRRRRTLTELYTRMPV LPEDQDVADERTRILAPSPDSLLHTPLIIKELSKV 1389 

Qy 2060 YKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSV 2119 

I : I : I I I I I I I I : I I I I I I I I I I I I I I : I I I I I I I : I I I hill II • 

Db 1390 YEQRV— PLLAVDRLSLAVQKGECFGLLGFNGAGKTTTFKMLTGEESLTSGDAFVGGHRI 1447 

Qy 2120 LKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVVPCWALEKLELTKY 2179

::: I : I : I I I I I I I I I : I I I I = I I I I I I h I I I : 

Db 1448 S SDVGKVRQRI GYCPQFDALLDHMTGREMLVMYARLRGI PERHI GACVENTLRGLLLEPH 1507 

Qy 2180 ADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSV 2239 

I : I I I I I I I I I I I I I I I I II II I I I I I h I I I I I III II: • : : h : : 

Db 1508 ANKLVRTYSGGNKRKLSTGIALIGEPAVIFLDEPSTGMDPVARRLLWDTVARARESGKAI 1567 

Qy 2240 VLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKS SQSVKDV 2296 

: : I I I I I I I I I I I I I I I I I I I h : I I I I I I I h : I I II : : : I 

Db 1568 IITSHSMEECEALCTRLAIMVQGQFKCLGSPQHLKSKFGSGYSLRAKVQSEGQQEALEEF 1627 

Qy 2297 VRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDN 2356

| : | | : : | : : | III : I h I I : h h : I I I I I I : h 

Db 1628 KAFVDLTFPGSVLEDEHQGMVHYHLPGRDLSWAKVFGILEKAKEKYGVDDYSVSQISLEQ 1687

Qy 2357 VFVNFAKKQSDNLEQ 2371 

I I : : I I I I: 
Db 1688 VFLSFAHLQPPTAEE 1702 



RESULT 9 

US-09-328-352-7592 

Sequence 7592, Application US/09328352 
Patent No. 6562958 
GENERAL INFORMATION: 
APPLICANT: Gary L. Breton et al.
TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO ACINETOBACTER
TITLE OF INVENTION: BAUMANNII FOR DIAGNOSTICS AND THERAPEUTICS
FILE REFERENCE: GTC99-03PA
CURRENT APPLICATION NUMBER: US/09/328,352
CURRENT FILING DATE: 1999-06-04
NUMBER OF SEQ ID NOS: 8252
SEQ ID NO 7592 
LENGTH: 589 
TYPE: PRT 

ORGANISM: Acinetobacter baumannii 
US-09-328-352-7592 

Query Match 2.9%; Score 363.5; DB 4; Length 589; 

Best Local Similarity 31.4%; Pred. No. 6.7e-23; 

Matches 102; Conservative 59; Mismatches 131; Indels 33; Gaps 7; 



Qy 1953 NLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQY 2012
||:: 1 : 1 1 : II MM : : :: 1
Db 265 NSGEI 1 DAVPRGEQVN FI SRQKELSTDIFPLGLVANRRPPELEDVFMMLLQQ 316

Qy 2013 NFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVD 2072
| : | : : : | : : : 1 : : 1 : : : : : I : : 1 II
Db 317 N QKQQI SIS QQAFRS EQNNN S Q S DQAVI WKDLVRTF GDFTAVA 360

Qy 2073 RLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGY 2132
1 : II II II 1 1 1 II 1 1 : 1 1: II 1 : 1 1 1 : : : : Ml
Db 361 NTSFTVQRGEIFGLLGPNGAGKTTTFRMLCGLLPASSGYLEVAGKNLRTARAEARAKVGY 420

Qy 2133 CPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKP AGTYSG 2189
I ||: || 1 M : : 1 M 1 : : I I : : : 1 , II M 1
Db 421 VSQKFALYSNLTVLENLKFFGGAYGLSGKKLDQQIDKALQQYDL KPQIKSGDLPG 475

Qy 2190 GNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSVVLTSHSMEEC 2249
I | :: | | | 1 1 : 1 M 1 1 1 1 1 M : 1 1 III 1 l-l 1 :::: 1 M M 1
Db 476 GYKQRLSMAAAILHEPEILFLDEPTSGIDPLARRLFWYSIGKLANQGITIIITTHFMEEA 535

Qy 2250 EALCTRLAIMVNGRLRCLGSIQHLK 2274
1 1 1 : 1 1 1 : 1 INI::
Db 536 E-YCDRIAIQDAGKLLALGSPQQVR 559





RESULT 10 

US-09-489-039A-10626
; Sequence 10626, Application US/09489039A
; Patent No. 6610836
; GENERAL INFORMATION:
; APPLICANT: Gary Breton et al.
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO KLEBSIELLA
; TITLE OF INVENTION: PNEUMONIAE FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 2709.2004001
; CURRENT APPLICATION NUMBER: US/09/489,039A
; CURRENT FILING DATE: 2000-01-27
; PRIOR APPLICATION NUMBER: US 60/117,747
; PRIOR FILING DATE: 1999-01-29
; NUMBER OF SEQ ID NOS: 14342
; SEQ ID NO 10626
; LENGTH: 317
; TYPE: PRT
; ORGANISM: Klebsiella pneumoniae
US-09-489-039A-10626

Query Match 2.8%; Score 360; DB 4; Length 317; 

Best Local Similarity 29.7%; Pred. No. 4.1e-23; 

Matches 101; Conservative 67; Mismatches 112; Indels 60; Gaps 11; 

Qy 991 VCVD--KLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGS 1048

: | : | ||:: : | I : : : : I I I I I : M I I : : : : M III

Db 16 LCIDVKNLNKHFGEHH--WKDFSLQVAKGEIYGFLGPNGSGKTTSIRMMCGLITPDSGE 73

Qy 1049 ATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMI 1108 

I | | | | : : : I : I : I I : : : I I : I : I I M I I : : I : : : : 



Db 74 GTCLGMDIFTQREKIKKKIGYMTQYFSMWGNLTIRENLLFIARLYSL--DRRRERVERAL 131

Qy 1109 EDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILK 1168
: I I : :: 1 1 : 1 1 1 1 1 : :: : : 1 : : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : = =
Db 132 SELGLTARQHQLAKELSGGWKQRMALAACMLHEPVLLFLDEPTAGVDPKARR 191

Qy 1169 YKP-GRTILLSTHHMDEADLLGDRIAIISHGKLKCCGS PLFLKGTYGDGYRL 1219
I : : 1 : 1 1 1 : 1 1 1 1 : : : | : | : | : | | : 1 III 1
Db 192 LSDRGISLLVSTHYMDEAERC-HKVAYLSYGRLLANGTIASIIASQNLITMRTSGAG--L 248

Qy 1220 TLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYIL 1279
II:: 1 : II 1 : 1 : 1
Db 249 TLLE SQLQRLPDIEQTVI FGNQLYITS 275



Qy 1280 PSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVF 1319

I I I I II : : : I : I I I I : I 

Db 276 RDEAKLKSA LFAFTQQGYE FCKVDTNLEDAF 306 



RESULT 11 

US-09-252-991A-31957 

Sequence 31957, Application US/09252991A 
Patent No. 6551795 
GENERAL INFORMATION: 
APPLICANT: Marc J. Rubenfield et al. 

TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS
TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
FILE REFERENCE: 107196.136
CURRENT APPLICATION NUMBER: US/09/252,991A
CURRENT FILING DATE: 1999-02-18
PRIOR APPLICATION NUMBER: US 60/074,788
PRIOR FILING DATE: 1998-02-18
PRIOR APPLICATION NUMBER: US 60/094,190
PRIOR FILING DATE: 1998-07-27
NUMBER OF SEQ ID NOS: 33142
SEQ ID NO 31957 
LENGTH: 345 
TYPE: PRT 

ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-31957 

Query Match 2.8%; Score 359.5; DB 4; Length 345; 

Best Local Similarity 27.0%; Pred. No. 5.4e-23; 

Matches 87; Conservative 78; Mismatches 134; Indels 23; Gaps 7; 

Qy 990 VVCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLGHNGAGKTTTMSILTGLFPPTSGSA 1049

: : : I : I : I : : : I I I : : : f I I I I I I I : I I : : I : I I : : I I 
Db 38 MI DI DRLSKRFSG- -RTWNDLS FRI DRGEIVGLLGPNGAGKSTTLKMLSGFLAPSAGSV 95 

Qy 1050 TIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIE 1109 

I : I | : : : : : I : I I : : : : I I I I I : : : : I I I : I : : 
Db 96 RIFGFDMQDKARQAQKLIGYLPENAPSYGEMTVEGFLAFVASIRDYSGREKRRRIDSAMD 155 

Qy 1110 DLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKY 1169 

: I I : : I I : : : I I I I I I : : : : I I : : : I I I I I I : I I : : I : 

Db 156 CMELRDERRSIIETLSKGFKRRVALAQAILHDPELLLLDEPTDGLDPNQKHQVRQLVKNL 215 



Qy 1170 KPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSP — LFLKGTYGDGYRLTLVKRPAE 1227 

: : : : I I I : : I : I : I : I : I : I I : I | : : : I : 

Db 216 SESKIWISTHILEEVSFMCSRALVINGGRLLADNTPGELRTRSRYHHAVSLS-IEAPVD 274 

Qy 1228 PGG PQEPGLASSPPGRAPLSSCSE — LQVSQFIRKHV-ASCLLVSDTSTELSYILP 1280 

I I I : I I : : : I : : : : I II II 
Db 275 PLAIAMLPGVAGIEGRPDRAGTLTILARPGVQILPALNRLIHGSGWRVSGVRTE 328 

Qy 1281 SEAAKKGAFERLFQHLERSLDA 1302 

I I : I : I I I 
Db 329 HGQLEEVFRQLTRET PA 345 



RESULT 12 

US-09-543-681A-4646 

Sequence 4646, Application US/09543681A 
Patent No. 6605709 
GENERAL INFORMATION: 
APPLICANT: GARY BRETON 

TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PROTEUS MIRABILIS FOR
TITLE OF INVENTION: DIAGNOSTICS AND THERAPEUTICS
FILE REFERENCE: 2709.1002-001
CURRENT APPLICATION NUMBER: US/09/543,681A
CURRENT FILING DATE: 2000-04-05
PRIOR APPLICATION NUMBER: US 60/128,706
PRIOR FILING DATE: 1999-04-09
NUMBER OF SEQ ID NOS: 8344
SEQ ID NO 4646 
LENGTH: 532 
TYPE: PRT 

ORGANISM: Proteus mirabilis 
US-09-543-681A-4646 

Query Match 2.7%; Score 343.5; DB 4; Length 532; 

Best Local Similarity 30.6%; Pred. No. 3.4e-21; 

Matches 91; Conservative 60; Mismatches 105; Indels 41; Gaps 8; 

Qy 1984 FEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVA 2035 

I : I I I I I I I : : I : : : I 

Db 223 FDW LVAMDAGKV ; -LATGHAEELKAQTATDELEAAFIELLPEE 263 

Qy 2036 — SERQRVL RGDADNDMVKIE--NLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLG 2088 

I : I : I : I : I : : I I II : I : : I I I : : I I I I I I 

Db 264 KRKNHQKVI I PPRDKSDDDI IAI EAKELT MRFGQFVAVDHVSFRIPKGEIFGFLG 318 

Qy 2089 VNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREH 2148

I I I I :: I I I I II : I I : : I I I I I : I : I I I I : : 

Db 319 SNGCGKSTTMKMLTGLLEASEGRAWLFGQEVDPKDIETRKRVGYMSQAFSLYSELTVRQN 378 

Qy 2149 LQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFI 2208 

I : I : : I I ■: I : : I I : II I I : : : I I I : I : I I : 

Db 379 LELHAKLFHIPEQDIPQRIKEMSERFNLTDVEDMMPDGLPLGIRQRLSLAVAVIHKPEML 438 

Qy 2209 FLDEPTTGMDPKARRFLWNLILDLI-KTGRSWLTSHSMEECEALCTRLAIMVNGRL 2264 

I I II I : I : I I I I I I : : I I : I : : : : : I I I I I I : : : I I : : 



Db 439 ILDEPTSGVDPIARDMFWKLMVDLSRRDGWIFISTHFMNEA-ARCDRMSLMHAGKV 494



RESULT 13 

US-09-252-991A-28171 

; Sequence 28171, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al.
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS
; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 107196.136
; CURRENT APPLICATION NUMBER: US/09/252,991A
; CURRENT FILING DATE: 1999-02-18
; PRIOR APPLICATION NUMBER: US 60/074,788
; PRIOR FILING DATE: 1998-02-18
; PRIOR APPLICATION NUMBER: US 60/094,190
; PRIOR FILING DATE: 1998-07-27
; NUMBER OF SEQ ID NOS: 33142
; SEQ ID NO 28171
; LENGTH: 788
; TYPE: PRT
; ORGANISM: Pseudomonas aeruginosa
US-09-252-991A-28171



Query Match 2.7%; Score 341.5; DB 4; Length 788; 

Best Local Similarity 33.6%; Pred. No. 1.1e-20;

Matches 95; Conservative 45; Mismatches 110; Indels 33; Gaps 6; 

Qy 2016 RRPQ— RMPVSTKPVEDDVDVA- SERQRVLRGDADN DMVKI 2053 

III : I I I I I : I I I I I : I : I 

Db 431 RRPADLAVPAGTPAV^QPEHRAAALGPARLPGGALARRARQPSADADARRGAEA 490 

Qy 2054 ENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAF 2113 

: I I I : I : I I : I I I I I I I I I I I I I : I I : : I : : I 

Db 491 DGATLRY GALTALS GLDLRLEPGEVLGLLGHNGAGKTTT I KLVLGLLAP S EGRVR 545 



Qy 2114 VNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEK 2173 

III | : : : | M I : : : I : I I : : I I : I : : : I I : I I : 
Db 546 VLGHDA — RSLEARRQLGYLPENVTFYPQLSGAETLRHFARLKGVAPAEAARL LEQ 599 

Qy 2174 LELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLI 2233 

: I I : I I I I : : : I I I I : I I : I I I I I I : I I I I : I : I 
Db 600 VGLGHAARRRLKTYSKGMRQRLGLAQALLGEPRLLLLDEPTVGLDPLATVELYQLLDRLR 659 

Qy 2234 KTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNR 2276

I : I I I I : I I I I : I I I : I I : I : : 
Db 660 GQGTGI VLCSHVLPGVETHI DRAAI LAGGRLQVAGSLAELRRK 702 



RESULT 14 

US-09-134-000C-6449 

; Sequence 6449, Application US/09134000C 
; Patent No. 6617156 
; GENERAL INFORMATION: 

; APPLICANT: Lynn Doucette-Stamm et al 



; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO
; TITLE OF INVENTION: ENTEROCOCCUS FAECALIS FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 032796-032
; CURRENT APPLICATION NUMBER: US/09/134,000C

; CURRENT FILING DATE: 1998-08-13 

; PRIOR APPLICATION NUMBER: US 60/055,778 

; PRIOR FILING DATE: 1997-08-15 

; NUMBER OF SEQ ID NOS: 6812 

; SOFTWARE: PatentIn version 3.1

; SEQ ID NO 6449 

LENGTH: 315 

; TYPE: PRT 

; ORGANISM: Enterococcus faecalis 
US-09-134-000C-6449

Query Match 2.7%; Score 339.5; DB 4; Length 315; 

Best Local Similarity 28.8%; Pred. No. 2.8e-21; 

Matches 99; Conservative 61; Mismatches 137; Indels 47; Gaps 7;

Qy 993 VDKLTKVYKDDKKLALNKLSLNLYENQWS FLGHNGAGKTTTMSILTGLFPPTSGSATIY 1052 

: I I I I : II : I : I : I I N I I I : I I : I : I I Ml I : 

Db 11 IQDLRKVYASGVE-ALRGIDLTVTlEGDFYALLGPNGAGKSTTIGIvTSLWKTSGKVKIF 69 

Qy 1053 GHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLE 1112 

I : I : I I I : : : I : I I II::: : .: : : : I : : I : : 

Db 70 GYDLDTEMVmKQQIGLVPQEFNFNPFETVQQIVVNQAGYYGVSRKEAMKRSEKYLKQSN 129 

Qy 1113 LSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKYK-P 1171 

I II: : I I I I I I I : I : I I : : : I I I I I I I I I I II : I : : 

Db 130 LWEKRNERARMLSGGMKRRLMIARALMHEPKLLILDEPTAGVDIELRREMWAFLQELNAQ 189 

Qy 1172 GRTI LLSTHHMDEADLLGDRI AI I SHGKLKCCGS PLFLKGT YGDGYRLTLVKRPAEPGGP 1231 

I I I : I : I I : : : I I : : I III I : I I : I 
Db 190 GTTIILTTHYLEEAEMLCRNIGIIQSGEL IENTSMKH 226 

Qy 1232 QEPGLASSPPGRAPLSSCSELQVSQFI RKHVASCLLVSDTST-ELSYILPSEAAKKG 1287 

: : I I II : : : : II I I : 

Db 227 LLAKLQFETFIFDLAPYTQAPVIEGYQSVFEDELTLAVEVERNQ 270 

Qy 1288 AFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLEN 1331 

I I : I I : : I I I I : I I I : : I : . I : 

Db 271 GVNHLFEQL— SQQGIKVLSMRNKSNRLEELFLKITEDTYQRED 312 



RESULT 15 

US-09-252-991A-18351 

; Sequence 18351, Application US/09252991A 
; Patent No. 6551795
; GENERAL INFORMATION:
; APPLICANT: Marc J. Rubenfield et al.
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS
; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 107196.136
; CURRENT APPLICATION NUMBER: US/09/252,991A
; CURRENT FILING DATE: 1999-02-18
; PRIOR APPLICATION NUMBER: US 60/074,788
; PRIOR FILING DATE: 1998-02-18
; PRIOR APPLICATION NUMBER: US 60/094,190
; PRIOR FILING DATE: 1998-07-27
; NUMBER OF SEQ ID NOS: 33142
; SEQ ID NO 18351
; LENGTH: 607
; TYPE: PRT
; ORGANISM: Pseudomonas aeruginosa
US-09-252-991A-18351 

Query Match 2.7%; Score 338.5; DB 4; Length 607; 

Best Local Similarity 15.4%; Pred. No. 1.2e-20; 

Matches 198; Conservative 105; Mismatches 235; Indels 747; Gaps 21; 

Qy 991 VCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGSAT 1050 

I : : : I : I I II II : : : : : I : I I I I I I - I I I I I I I : I ' I 
Db 37 WIEDVDKHFGDVK — ALRGLSARIHYGRLTGLVGPDGAGKTTLMRILTGLLVPNAGRVT 94 

Qy 1051 I YGHDI RTEMDEI RKNLGMCPQHNVLFDRLTVEEHLWFYS RLKSMAQEEI RREMDKMI ED 1110 

: I : I : .: I I I II I : : I : I I : : I : : I : I : : : : : 

Db 95 LAGYDWKDNDAIHVASGYMPQRFGLYEDLSVMENMRLYAQLRGMDADRNAELFAELLDF 154 

Qy 1111 LELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKY- 1169 

I 1 I I I I I I : I I : I I : : : : I I I I I I I I : I : : I : : 

Db 155 TRLGPFTKRIAGKLSGGMKQKLGLACALMARPKVLLLDEPGVGVDPVSRQDLWRMVQALT 214 

Qy 1170 KPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEPG 1229 

I : : I I : : I I I : : I 

Db 215 DEGMAWWSTAYLDEAE RC 233 

Qy 1230 GPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAF 1289 

Db 234 233 

Qy 1290 ERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEG 1349 

III 

Db 234 ESVLL 238 

Qy 1350 PASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLFDNPQDPDN 1409

III I I I I 
Db 239 LNQGQL LFDGPP 250 

Qy 1410 VSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFV 1469 

I I : I : | : | | : : I I 
Db 251 QELTAQ LEGRSFR LENVGAERR 272 

Qy 1470 CVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEERREYRLRLSPDASPQQL 1529 

: I I I : I : I 

Db 273 -AVLTEALDLPSVSD 286 

Qy 1530 VSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNF 1589

: I I I I : I 

Db 287 GVI Q GAGVRWL REGA 302 

Qy 1590 VPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTG 1649 

11:11 

Db 303 PTEQIQA 309 



Qy 1650 FSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLHRYGAITFGNVLKSIPA 1709 

: I I : : I : I I 

Db 310 LADRAQVQ LAPVPA 323 

Qy 1710 SFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGIT 1769 

I I : I M. 
Db 324 RF EDAFIDLLG GGP 337 

Qy 1770 VTNH PMNKT S AS L S LD YLLQGT DWIAI FI I VAMS FVP AS FWFLVAEKS T KAKH LQ FVS 1829 

Db 338 337 

Qy 1830 GCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITP 1889 

Db 338 337

Qy 1890 IMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIF 1949

I 1:1 : I I 

Db 338 GGTSTLAERL 347 

Qy 1950 PNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIM 2009 

Db 348 347

Qy 2010 CQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRIL 2069 

III III I MM : I 

Db 348 SPVELGSDVA VSCRNLTK RFGEFT 371 

Qy 2070 AVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQS 2129 

||:: I : II I II I I I I I I I : : I I II I I I M I I M : : 

Db 372 ATDQVSFEVQKGEIFGLLGPNGAGKSTTFKMLCGLLKPTAGEAHWGHDLRHATGAAKSQ 431

Qy 2130 LGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSG 2189

Ml I : I • I : I : : I : : I : : : : M : : 

Db 432 LGYMAQKFSLYGLLSVRQNLEFSAGVYGLEGNVRRERIEEMIATFDLGDWLSATPDSLPL 491

Qy 2190 GNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEEC 2249

| : | : : | : I : I : I : II I I I I : I : M II I I I : I : : : : I : I I : I 
Db 492 GHKQRLALACSLMHRPPVLFLDEPTSGVDPITR 551 

Qy 2250 EALCTRLAIMVNGRLRCLGSIQHLK 2274 

I I I : I : : III: II 
Db 552 E-YCDRVAMLSRARLIALDTPDALK 575 



Search completed: September 1, 2004, 10:58:55 
Job time : 59 secs



GenCore version 5.1.6 
Copyright (c) 1993-2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: September 1, 2004, 10:46:22 ; Search time 68 Seconds 

(without alignments) 
3445.920 Million cell updates/sec 



Title: US-10-088-467-2
Perfect score: 12668 

Sequence: 1 MGFLHQLQLLLWKNVTLKRR GLISFEEERAQLSFNTDTLC 2436 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 283366 seqs, 96191526 residues 

Total number of hits satisfying chosen parameters: 283366 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
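
The run parameters above name a Smith-Waterman ("sw") model with an affine gap penalty (Gapop 10.0, Gapext 0.5). The following is a minimal sketch of that kind of scoring, not GenCore's implementation. Two assumptions are made here that the report does not state: the first gapped residue is charged Gapop and each further one Gapext, and a hypothetical `toy_score` function stands in for the BLOSUM62 table.

```python
from typing import Callable


def local_alignment_score(
    a: str,
    b: str,
    score: Callable[[str, str], float],
    gap_open: float = 10.0,
    gap_ext: float = 0.5,
) -> float:
    """Best Smith-Waterman local score with affine gaps (Gotoh recurrences)."""
    neg = float("-inf")
    prev_h = [0.0] * (len(b) + 1)   # best scores for the previous row of a
    prev_f = [neg] * (len(b) + 1)   # previous row, alignment ends in a gap in b
    best = 0.0
    for i in range(1, len(a) + 1):
        cur_h = [0.0] * (len(b) + 1)
        cur_f = [neg] * (len(b) + 1)
        e = neg                      # current row, alignment ends in a gap in a
        for j in range(1, len(b) + 1):
            # Assumed convention: opening costs gap_open, extending gap_ext.
            e = max(cur_h[j - 1] - gap_open, e - gap_ext)
            cur_f[j] = max(prev_h[j] - gap_open, prev_f[j] - gap_ext)
            diag = prev_h[j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: never let a cell drop below zero.
            cur_h[j] = max(0.0, diag, e, cur_f[j])
            best = max(best, cur_h[j])
        prev_h, prev_f = cur_h, cur_f
    return best


def toy_score(x: str, y: str) -> float:
    """Hypothetical stand-in for BLOSUM62: +5 identical, -4 otherwise."""
    return 5.0 if x == y else -4.0


# Eight identities bridged by a two-residue gap: 8*5 - 10.0 - 0.5
print(local_alignment_score("AAAATTTT", "AAAACCTTTT", toy_score))  # 29.5
```

Only the best score is returned; recovering the aligned Qy/Db rows would need a traceback, which is omitted to keep the sketch short.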



Database :   PIR It
             pir1:*
             pir2:*
             pir3:*
             pir4:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
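
The percentage figures printed with each result can be checked against the other numbers in the report. The sketch below assumes, based only on the values shown here and not on GenCore documentation, that "Query Match" is the score divided by the perfect score and that "Best Local Similarity" is the number of matches divided by all aligned columns (matches, conservative, mismatches, indels). The function names are illustrative.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect (self-alignment) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all aligned columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


PERFECT_SCORE = 12668  # "Perfect score" from the run header

# Figures reported for the hits scoring 2622 against the 2436-residue query
print(query_match_percent(2622, PERFECT_SCORE))    # 20.7
print(best_local_similarity(638, 317, 556, 364))   # 34.0
```

Both values agree with the "Query Match 20.7%" and "Best Local Similarity 34.0%" lines printed for those hits, which supports the assumed definitions without confirming them.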



                                SUMMARIES

                   %
Result            Query
   No.     Score  Match  Length  DB  ID      Description

     1      7979   63.0    1529   2  A59189  ATP-binding casset
     2      7119   56.2    1472   2  B54774  ATP binding casset
     3      4103   32.4    2201   2  A54774  ATP binding casset
     4      2622   20.7    1704   2  S71363  probable ATP-bindi
     5      2622   20.7    1704   2  A59188  ATP-binding casset
     6      2024   16.0    1802   2  T33783  hypothetical prote
     7    1964.5   15.5    1816   2  A84845  probable ABC trans
     8      1920   15.2     373   2  T47150  hypothetical prote
     9    1718.5   13.6    1447   2  T15200  hypothetical prote
    10      1688   13.3    1317   2  C88925  protein F33E11.4 [
    11      1524   12.0    1767   2  S60124  transport protein
    12    1522.5   12.0    1758   2  F88559  protein C48B4.4b [
    13      1515   12.0    1704   2  T42749  ATP-binding casset



    14       1440.5         11.4         1246   2  T00826   hypothetical prote
    15  [illegible]          9.5         1564   2  T27121   hypothetical prote
    16  [illegible]          8.3         1431   2  T22748   hypothetical prote
    17          746          5.9          269   2  T46467   hypothetical prote
    18        707.5          5.6          895   2  T07714   probable ABC-type
    19  [illegible]  [illegible]          900   2  T07717   probable ABC-type
    20          709          5.5         1011   2  T07712   probable ABC-type
    21  [illegible]          5.4          925   2  T07713   probable ABC-type
    22  [illegible]          5.3         1336   2  T18288   ABC transport prot
    23  [illegible]  [illegible]          722   2  T07716   probable ABC-type
    24  [illegible]  [illegible]          130   2  I38906   ATP-binding casset
    25  [illegible]          3.7          664   2  T07715   probable ABC-type
    26        467.5          3.7          149   2  I38905   ATP-binding casset
    27  [illegible]  [illegible]          196   2  T12512   hypothetical prote
    28        427.5          3.4          327   2  D72257   hypothetical prote
    29        421.5  [illegible]          260   2  T15237   hypothetical prote
    30          417  [illegible]          324   2  C71081   probable resistanc
    31  [illegible]  [illegible]          350   2  B69065   ABC transporter (A
    32        411.5  [illegible]  [illegible]   2  E75108   daunorubicin resis
    33  [illegible]  [illegible]  [illegible]   2  S74048   probable daunorubi
    34  [illegible]  [illegible]          330   2  S27707   daunorubicin resis
    35  [illegible]  [illegible]  [illegible]   2  S76278   ABC-type transport
    36  [illegible]  [illegible]          900   2  AG2116   ABC transporter AT
    37  [illegible]  [illegible]          311   2  G69803   ABC transporter (A
    38  [illegible]  [illegible]          333   2  D72492   probable ABC trans
    39  [illegible]  [illegible]          275   2  D90267   ABC transporter, A
    40          395          3.1          310   2  E96920   ABC transporter (A
    41          394          3.1          325   2  S32908   hypothetical prote
    42          394          3.1          727   2  T07718   probable ABC-type
    43        393.5          3.1          312   2  C69012   ABC transporter (A
    44        392.5          3.1          297   2  AE1816   ABC transporter (A
    45          391          3.1          305   2  E75122   hypothetical prote

RESULT 1 
A59189 

ATP-binding cassette transporter - human (fragment) 
N;Alternate names: KIAA1062 protein 
C;Species: Homo sapiens (man) 
C;Date: 18-Feb-2000 #sequence_revision 18-Feb-2000 #text_change 02-Jun-2000 
C;Accession: A59189 
R;Kikuno, R.; Nagase, T.; Ishikawa, K.; Hirosawa, M.; Miyajima, N.; Tanaka, A.; 
Kotani, H.; Nomura, N.; Ohara, O. 
DNA Res. 6, 197-205, 1999 
A;Title: Prediction of the coding sequences of unidentified human genes. XIV. 
The complete sequences of 100 new cDNA clones from brain which code for large 
proteins in vitro. 
A;Reference number: Z22961; MUID:99397452; PMID:10470851 
A;Accession: A59189 
A;Status: preliminary; not compared with conceptual translation 
A;Molecule type: mRNA 
A;Residues: 1-1529 <KIK> 
A;Cross-references: GB:AB028985; NID:g5689460; PIDN:BAA83014.1; PID:d1046841; 
PID:g5689461 
A;Experimental source: chromosome 9; clone hj03579; clone lib pBluescriptII SK 
plus; tissue type brain. 
C;Genetics: 
A;Map position: 9 v 
A;Note: KIAA1062 
C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette 
homology 

Query Match 63.0%; Score 7979; DB 2; Length 1529; 
Best Local Similarity 100.0%; Pred. No. 0; 
Matches 1529; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
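
The Query Match percentages in this listing appear to equal the hit score divided by the perfect score (12668 for this query), rounded to one decimal place. That relationship is an inference from the printed numbers, not a documented GenCore formula, and the function name below is hypothetical. A minimal Python sketch under that assumption:

```python
def query_match_percent(score, perfect_score):
    """Assumed relationship: Query Match % = 100 * score / perfect score."""
    return round(100.0 * score / perfect_score, 1)


if __name__ == "__main__":
    # Scores of results 1-4 against the perfect score of 12668
    for score in (7979, 7119, 4103, 2622):
        print(score, query_match_percent(score, 12668))
```

Under this assumption the sketch gives 63.0, 56.2, 32.4 and 20.7 for results 1 to 4, which agrees with the values printed for those results.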

Qy        908 GILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQAC 967
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQAC 60

Qy        968 AMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLGHN 1027
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLGHN 120

Qy       1028 GAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLW 1087
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLW 180

Qy       1088 FYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIIL 1147
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 FYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIIL 240

Qy       1148 DEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPL 1207
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 DEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPL 300

Qy       1208 FLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLL 1267
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 FLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLL 360

Qy       1268 VSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQ 1327
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 VSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQ 420

Qy       1328 SLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEG 1387
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 SLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEG 480

Qy       1388 AGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKR 1447
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 AGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKR 540

Qy       1448 FHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIP 1507
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 FHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIP 600

Qy       1508 YANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLL 1567
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 YANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLL 660

Qy       1568 AARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTS 1627
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 AARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTS 720

Qy       1628 APSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVSEYLLFTS 1687
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 APSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVSEYLLFTS 780

Qy       1688 DRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSL 1747
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        781 DRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSL 840

Qy       1748 NNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDVVIAIFIIVAMSFVP 1807
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        841 NNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDVVIAIFIIVAMSFVP 900

Qy       1808 ASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAY 1867
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        901 ASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAY 960

Qy       1868 TSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQ 1927
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        961 TSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQ 1020

Qy       1928 LFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWD 1987
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1021 LFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWD 1080

Qy       1988 IVTRGLVAMAVEGVVGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDAD 2047
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1081 IVTRGLVAMAVEGVVGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDAD 1140

Qy       2048 NDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDEST 2107
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1141 NDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDEST 1200

Qy       2108 TGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVV 2167
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1201 TGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVV 1260

Qy       2168 KWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWN 2227
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1261 KWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWN 1320

Qy       2228 LILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRT 2287
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1321 LILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRT 1380

Qy       2288 KSSQSVKDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDY 2347
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1381 KSSQSVKDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDY 1440

Qy       2348 SVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADE 2407
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1441 SVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADE 1500

Qy       2408 PEDLDTEDEGLISFEEERAQLSFNTDTLC 2436
              |||||||||||||||||||||||||||||
Db       1501 PEDLDTEDEGLISFEEERAQLSFNTDTLC 1529



RESULT 2 
B54774 

ATP binding cassette transporter ABC2 - mouse (fragment) 
C;Species: Mus musculus (house mouse) 
C;Date: 23-Mar-1995 #sequence_revision 05-Apr-1995 #text_change 02-Feb-2001 
C;Accession: B54774 
R;Luciani, M.F.; Denizot, F.; Savary, S.; Mattei, M.G.; Chimini, G. 
Genomics 21, 150-159, 1994 
A;Title: Cloning of two novel ABC transporters mapping on human chromosome 9. 
A;Reference number: A54774; MUID:94375008; PMID:8088782 
A;Accession: B54774 
A;Molecule type: mRNA 
A;Residues: 1-1472 <LUC> 
A;Cross-references: GB:X75927; NID:g495258; PIDN:CAA53531.1; PID:g495259 
C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette 
homology 
C;Keywords: ATP; nucleotide binding; P-loop 
F;44-234/Domain: ATP-binding cassette homology <ABC1> 
F;61-68/Region: nucleotide-binding motif A (P-loop) 
F;1108-1300/Domain: ATP-binding cassette homology <ABC2> 
F;1126-1133/Region: nucleotide-binding motif A (P-loop) 

Query Match 56.2%; Score 7119; DB 2; Length 1472; 
Best Local Similarity 94.2%; Pred. No. 0; 
Matches 1388; Conservative 22; Mismatches 60; Indels 4; Gaps 4; 
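
The Best Local Similarity figure appears to be the number of identical positions divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). This is inferred from the counts printed in this listing, not taken from GenCore documentation, and the function name below is hypothetical. A short Python sketch under that assumption:

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Assumed relationship: identical columns as a percentage of all columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Counts printed for RESULT 2 (B54774)
    print(best_local_similarity(1388, 22, 60, 4))
```

With the counts printed for results 1 to 4 this gives 100.0, 94.2, 40.9 and 34.0, which matches the percentages shown in the listing.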



Qy 


965 


QACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQWSFL 

1 1 1 1 1 1 1 1 I I 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 : I 1 1 1 1 1 1 1 1 1 > 1 1 1 1 1 1 1 1 1 1 1 

QACAMESRHFEETRGMEEEPTHLPLWCVDKLTKVYKNDKKLALNKLSLNLYENQWSFL 


1024 


Db 


1 


60 


Qy 


1025 


GHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEE 

1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

GHNGAGKTTTMS I LTGLFP PT SGSAT I YGHDI RTEMDEI RKNLGMC PQHNVLFDRLTVEE 


1084 


Db 


61 


120 


Qy 


1085 


HLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRA 

I | | 1 1 1 II 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

HLWFYSRLKSMAQEEIRKETDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRA 


1144 


Db 


121 


180 


Qy 

Db 


1145 
181 


1 1 LDEPTAGVDP YARRAIWDLI LKYKPGRT I LLSTHHMDEADLLGDRI AI I SHGKLKCCG 

I I I 1 1 1 1 II 1 1 II 1 1 1 I 1 1 1 1 1 M 1 1 1 1 1 I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 

I I LDEPTAGVDP YARRAIWDLI LKYKPGRT I LLSTHHMDEADLLGDRI All SHGKLKCCG 


1204 
240 


Qy 

Db 


1205 
241 


SPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVAS 

1 1 1 1 1 1 1 1111111111:11111 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 II 

SPLFLKGAYXDGYRLTLVKQPAEPGTSQEPGLASSPSGCPRLSSCSEPQVSQFIRKHVAS 


1264 
300 


Qy 

Db 


1265 
301 


CLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSE 

I 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 M.I II 
SLLVSDTSTELSYILPSEAVKKGAFERLFQQLEHSLDALHLSSFGLMDTTLEEVFLKVSE 


1324 
360 



Qy 
Db 



1325 
361 



1384 



420 



Qy 


1385 


Db 


421 


Qy 


1445 


Db 


481 


Qy 


1505 


Db 


541 


Qy 


1565 


Db 


601 


Qy 


1624 


Db 


661 


Qy 


1684 


Db 


721 


Qy 


1744 


Db 


781 


Qy 


1804 


Db 


841 


Qy 


1864 


Db 


901 


Qy 


1924 


Db 


961 


Qy 


1984 


Db 


1021 


Qy 


2044 


Db 


1081 


Qy 


2103 


Db 


1141 


Qy 


2163 


Db 


1201 


Qy 


2223 



DEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLL 1444 

: | | | | : | I I I I I I I I I I I I I I I I I I I I I I I I :: I I I I I I I I : I I I I : I I I i I I I 

EEGTGYSDGYGDYRPLFDNLQDPDNVSLQEAEMEALAQVGQGSRKLEGWWLKMRQFHGLL 4 80 

VKRFHCARRNSKALFSQI LLPAFFVCVAMTVALSVPEI GDLPPLVLS PSQYHN YTQPRGN 1504 

I I I I I I I M I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

VKRFHCARRNSKALCSQI LLPAFFVCVAMTVALSVPEI GDLPPLVLS PSQYHNYTQPRGN 540 

FIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGES 1564 

| | I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M II I I I I 

FIPYANEERQEYRLRLS PDAS PQQLVSTFRLPSGVGATCVLKS PANGS LGPMLNLSSGES 600 

RLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDED-LQAWNVSLPPTAGPE 1623 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I 

RLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPVXPDEDSLQAWNMSLPPTAGPE 660 

MWT SAP S LP RLVRE P VRCTC S AQ GTGFSCPSS VGGH P P QMRWT GD I LT D I T GHNVS E YL 1683 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TWTSAPSLPRLVHEPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYL 720 

LFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTY 174 3 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 

LFTSDRFRLHRYGAITFGNVQKSI PASFGARVPPMVRKIAVRRVAQVLYNNKGYHSMPTY 78 0 

LNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDVVIAIFIIVAM 1803 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

LNSLNNAILRANLPKSKGNPAAYXITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAM 840 



S FVPAS FWFLVAEKSTKAKHLQFVS GCNPI I YWLAN YVWDMLNYLVPATCCVI I LFVFD 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I II I I I I I I I I 

S FVPAS FWFLVAEKSTKAKHLQFVS GCN PVI YWLAN YVWDMLNYLVPATCCVI I LFVFD 



1863 



900 



L P AYT S P TN F P AVL SLFLLYGWSITPI MY PAS FW FEVP S S AYVFL I VI N L F I G I TATVAT 1923 

M I I I I II I I I I I I I II I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

L P AYT S P TN F PAVL S L FLL YGW S I T P I MY PAS FW FEVP S S AYVFL I VI N L FI G I TATVAT 960 

FLLQLFEHDKDLKWNS YLKSCFLI FPNYNLGHGLMEMAYNEYINEYYAKI GQFDKMKS P 1983 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
FLLQLFEHDKDLKWNS YLKSCFLI FPNYNLGHGLMEMAYNEYINEYYAKI GQFDKMKSP 102 0 

FEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLR 2043 

I 1 1 1 I 1 I I ! I I I 1 II II llllll 1:1 Ihlll I II I Ml II I II Ml I I M 

FEWDIVTRGLVAMTVEGFVGFFLTIMCQYNFLRQPQRLPVSTKPVEDDVDVASERQRVLR 108 0 

GDADNDMVKI ENLTKVYKS RKI GRI LAVDRLCLGV- RPGEC FGLLGVNGAGKT ST FKMLT 2102 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GD7VDNDMVKI ENLTKVYKS RKI GRI LAVDRLCLGVCVPGECFGLLGVNGAGKTSTFKMLT 114 0 

GDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGI SWKD 2162 

I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I Ml 
GDESTTGGEAFVNGHSVLKDLLQVQQSLGYCPQFDVPVDELTAREHLQLYTRLRCIPWKD 1200 

EARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKAR 2222 

I | : | | | | | | | | | | | | I I | I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I 
EAQWKWALEKLELTKYADKPAGT YSGGNKRKLSTAI ALI GYPAFI FLDEPTTGMDPKAR 1260 

RFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYM 2282 



Db 



1261 



I | I I || | I I I II I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I M I I I I I I I I I I I I I 

RFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLHCLGSIQHLKNRFGDGYM 132 0 



Qy 


2283 


Db 


1321 


Qy 


2343 


Db 


1381 


Qy 


2403 


Db 


1439 


ITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVL 2342 

I | I I I I I I I : I I I I I I I I I I I I I I I : : I I I I I I I I I I I I I I I II I I I I I I I I I 

ITVRTKSSQNVKDWRFFNRNFPEAHAQGKTPYKVQYQLKSEHISLAQVFSKMEQWGVL 1380 

GIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRA 2402 
I | | | | | | | | II I I I I I I I I I I I I I I I : I I I I I I I : I I I I I I I I I II I I I I I I II I 



LVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 
|||||||||||||||||||||||||||||||||| 
LVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 1472 



RESULT 3 
A54774 

ATP binding cassette transporter ABC1 - mouse 
C;Species: Mus musculus (house mouse) 
C;Date: 05-Apr-1995 #sequence_revision 05-Apr-1995 #text_change 02-Feb-2001 
C;Accession: A54774 
R;Luciani, M.F.; Denizot, F.; Savary, S.; Mattei, M.G.; Chimini, G. 
Genomics 21, 150-159, 1994 
A;Title: Cloning of two novel ABC transporters mapping on human chromosome 9. 
A;Reference number: A54774; MUID:94375008; PMID:8088782 
A;Accession: A54774 
A;Molecule type: mRNA 
A;Residues: 1-2201 <LUC> 
A;Cross-references: GB:X75926; NID:g495256; PIDN:CAA53530.1; PID:g495257 
C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette 
homology 
C;Keywords: ATP; duplication; nucleotide binding; P-loop 
F;856-1047/Domain: ATP-binding cassette homology <ABC1> 
F;873-880/Region: nucleotide-binding motif A (P-loop) 
F;1869-2060/Domain: ATP-binding cassette homology <ABC2> 
F;1886-1893/Region: nucleotide-binding motif A (P-loop) 

Query Match 32.4%; Score 4103; DB 2; Length 2201; 
Best Local Similarity 40.9%; Pred. No. 2.1e-263; 
Matches 943; Conservative 323; Mismatches 626; Indels 414; Gaps 58; 



Qy 


244 


Db 


95 


Qy 


289 


Db 


144 


Qy 


348 


Db 


184 


Qy 


397 


Db 


235 



|| I |::| I II INI: ::hl I 

qriFT.nHNT.^T.PRSTVnSTJ.OXNVGLOKVFLOGYOLHLASLCNGS KLEEI 143 



|||:|||| I ::| I |::||: ::|:| : I 

QLGDAEVSALCGL PRKKLDA AERVLRYNMDI LKPWTKL 183 

>QGACT GRT PGP PAS GA GGAAN GT GAGAVMG PNATAEEGAP SAAALAT P 396 

|| : I III I : : I I : 
rs TSHLPTQHI^AEATTVXLDSLGGLAQELFSTKSWSDMRQEVMFLTNVNSS 234 

)TLQGQCSAFVQLWAGLQPILCGNNRTIEPE t ALRRGN 433 

| : I : : : I : I I : II I I I I 
SSSTOIYOAVSRIVCGH PEGGGLKIKSLNWYEDNNYKALFGGNNTEED 282 



Qy 


434 


MSSLGFTSKEQRNLGLLVHLMTSNP K I L YAPAG S EVD RV 

| : : : : 1 I I : 1 Mill : 1 
VDTFYDNSTTPYCNDLMKNL ES S PLSRI IWKALKPLLVGKI LYTPDT PATRQV 


472 


Db 


283 


335 


Qy 


473 


I LKANETFAFVGNVTHYAQVWLNI SAEIRSFLEQGRLQQHLRWL 
: : 1 : 1 1 : : 1 : 1 : 1 : 1 : 1 : : 1 1 

MAEVNKT FQELAVFHDLEGMWEELSPQIWTFMENSQEMDLVRTLLDSRGNDQFWEQKLDG 


516 


Db 


336 


395 


Qy 


517 


QQYVAELRLHPEAL NLSLDELPPALRQDNFSLPSGMALLQQLDTIDNAACGW 

1 : 1 1 : 1 1 : 1 1 : 1 : 1 1 : 1 1 


568 


Db 


396 


LDWTAQDIMAFLAKNPEDVQSPNGSVYTWREAFNETN QAIQTIS 


439 


Qy 


569 


IQFMSKVSVDIFKGFPDEES'IVNYTLNQAYQDNVTVFASVIFQ — TRKDGSLPPHVHYKI 
: || | : : : : I I : : I : : I : 1 : : 1 1 1 1 1 1 1 1 1 
-RFMECVNLNKLEPIPTEVRLINKSME— LLDERKFWAGIVFTGITPDSVELPHHVKYKI 


626 


Db 


440 


496 


Qy 


627 


RQNS S FTEKTNEI RRAYWRPGPNTG GRFYFLYGFVWIQDMMERAI IDTFVGHDWEP 

1 : 1 : 1 1 : 1 : 1 1 1 1 1 1 1 1 : : 1 1 : : 1 : 1 1 1 1 : : 
RMDIDNVERTNKIKDGYWDPGPRADPFEDMRYVWGGFAYLQDWEQAIIRVLTGSE-KKT 


683 


Db 


497 


555 


Qy 

Db 


684 
556 


GSWQMFPYPCYTRDDFLFVIEHMMPLCMVISWWSVAMTIQHIVAEKEHRLKEVMKTMG 

1 1 1 1 1 1 1 II INI: 1 1 1 1 :: 1 : 1 1 1 1 : 1 : 1 1 1 II 1 1 1 1 1 : 1 1 

GVTVQQMPYPCYVDDIFLRVMSRSMPLFMTLAWIYSVAVIIKSIVYEKEARLKETMRIMG 


743 
615 


Qy 


744 


LNN AVHWVAW F I T G FVQ LSI S VT ALT AI LK YGQVLMH S HWI I WL FLAVYAVAT I MFC FL 
| : | : I : I I : : : I : 1 1 1 1 1 1 : 1 : 1 : : : : 1 1 : 1 : 1 : 1 1 : 1 1 1 
LDNGI LWFSWFVS S LI PLLVSAGLLWI LKLGNLLP YS DP S WFVFLS VFAMVT I LQCFL 


803 


Db 


616 


675 


Qy 


804 


VS VL YS KAKLAS ACGGI I YFLS YVP YMYVAI REEVAHDKI TAFE- KC IAS LMSTTAFGLG 

: 1 1 : 1 : 1 1 1 : 1 1 1 1 1 1 1 1 1 : 1 1 : II 1 1 1 1 1 : 1 1 1 1 1 
ISTLFSRANLAAACGGIIYFTLYLPYVLC VAWQDYVGFSIKI FASLLSPVAFGFG 


862 


Db 


676 


730 


Qy 


863 


SKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAWMLM^ 

: 1 1 1 1 : 1 I : I : 1 I : 1 1 1 1 1 1 1 1 1 1 : 1 :: 1 : 1 1 : : 1 1 1 1 1 1 1 1 1 1 
CEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTAVSMMLFDTFLYGVMTWYIEAVFPGQY 


922 


Db 


731 


790 


Qy 


923 


GLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQACAMESRRFEETRGMEE 

1:111111111111 1 1 1 1 : : 1 : 1 1 1 1 
GI PRPWYFPCTKS YWFGE EIDEKSHPGSSQKGVS EIC MEE 


982 


Db 


791 


830 


Qy 


983 


EPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLF 

1 1 M 1 1 1 : 1 1 1 1 : 1 1 : 1 : : MM II 1 : 1 1 1 1 1 1 1 1 1 M 1 1 1 1 II 1 1 1 1 
EPTHLRLGVS IQNLVKVYRDGMWAVDGLALNFYEGQITS FLGHNGAGKTTTMS I LTGLF 


1042 


Db 


831 


890 


Qy 


1043 


PPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRR 

1 1 1 1 1 : 1 1 1 llhll 1 1 : 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 : 1 1 1 : : : : - 
P PT S GTAYI LGKDI RS EMS S I RQNLGVCPQHNVLFDMLTVEEHI WFYARLKGLS EKHVKA 


1102 


Db 


891 


950 


Qy 


1103 


EMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRA 

II : : I 1 : 1 : I 1 1 1 1 1 1 : 1 1 1 1 1 1 : 1 1 1 1 1 1 : : 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 

EMEQMALDVGLPPSKLKSKTSQLSGGMQRKLSVALAFVGGSKWILDEPTAGVDPYSRRG 


1161 


Db 


951 


1010 


Qy 


1162 


IWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTL 

1 I : I : I 1 1 : 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 II Ml 

IWELLLKYRQGRTI I LSTHHMDEADI LGDRI AI I SHGKLCCVGS S LFLKNQLGTGYYLTL 


1221 


Db 


1011 


1070 



1222 VKRPAEPG " GPQEPGLASSPPGRAPLSSCSELQVSQFIR 1259 

|| : | : I I I I : I I I 

1071 VKKDVES S LS SCRNSS STVSCLKKEDSVSQS S SDAGLGSDHESDTLTI DVS AI SNLI R 1128 

1260 KHVASCLLVS DTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVF 1319 

III: Ml I I : I : I I I I I I : I I I II : : I I M I M : M I II I M 
1129 KHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDDRLSDLGISSYGISETTLEEIF 1188 

1320 LKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSV 1379 

111:11 I I I : I I 

1189 LKVAEE SGVDA- ETSDGTLP 1207 

1380 GSARGDEGAGYTDVYGDYRPLF DNPQDPD — NVSLQEVEAEALSRV-GQGSRKLD 1431 

||: | : | | : I : I I : : : : I : I I : I : I I : I 

1208 — ARRNRRA FGDKQSCLHPFTEDDAVDPNDSDIDPESRETDLLSGMDGKGSYQLK 1260 

1432 GGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLS 1491 

| | : | | | I I I I I I : I I : I I : I I I I I I : I : MM I III 

1261 GWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALVFSLIVPPFGKYPSLELQ 132 0 

14 92 PSQYH-NYTQPRGNFIP YAN EERREYRLRLSP DAS P Q Q LVS T FRL P S GVG AT C VL K S PAN 1550 
| |: || : : : I I hi— I I I : : I 

1321 PWMYNEQYT FVSNDAPE DMGT Q E L LN AL TKDPGFGT RCME GN P I P 1365 

1551 GSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQ 1610 

I: II : Ml 

1366 DTPCL— AGEED— 1375 

1611 AWNVSLPPTAGPEM WTSAPSLPRLVREPVRCTCSAQGTGFS CPSSVGG-HPP 1661 

| : I | : : : II I Ml: II MM 

1376 -WTISPVPQSIVDLFQNGNWTMKNPSP ACQCSSDKIKKMLPVCPPGAGGLPPP 1427 

1662 QMRWTGDILTDITGHNVSEYLLFTSDRF RLHRYGAITFG 1701 

I : I Ml : M I I : I : I I : I : ■ Ml : I 

1428 QRKQKTADILQNLTGRNISDYLVKTYVQIIAKSLKNKIWVNEFRYGGFSLGVSNSQALPP 1487 

170 2 NVLKS I PAS FGTRAP PMVRKI A VRRAAQVFYNNKGYHSMPTYLN 1745 

: | : : | | : : : : I : : I I I I M : : : M I 

1488 SHEVNDAI KQMKKLLKLTKDTSADRFLSSLGRFl^GLDTKNNVKVWFNNKGWHAISSFLN 1547 

1746 SLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLS-LDYLLQGTDWIAIFIIVAMS 1804 

:| I I I M I I I I : II : I M I IMM I II : ' : ll::M M III 
1548 VINNAILRANLQKGE-NPSQYGITAFNHPLNLTKQQLSEVALMTTSVDVLVSICVIF7\MS 1606 

1805 FVPASFWFLVAEKSTK7VKHLQFVSGCNPIIYWI^YVWDMLNYLVPATCCVIILFVFDL 1864 

I I I I I I I I I I : I : : II Mil I M I I M I I I M M I I I II M I I I Ml I 

1607 FVPAS FWFLIQERVSKAKHLQFI SGVKPVI YWLSNFVWDMCNYWPATLVI 1 1 FI CFQQ 1666 

1865 PAYT S PTN F P AVL S L FLL YGW S I T P I MY PAS FW FEVP S S AYVFLI VI NL FI G I TAT VAT F 1924 

: I I || I : I II I M M II M II M I I : M I M II I M I II II Mill 
1667 KSYVSSTNLPVLALLLLLYGWSITPLMYPAS FVFKIPSTAYWLTSVNLFIGINGSVATF 1726 

1925 LLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPF 1984 

: | : | | : : I I : I II I I II II : : II I I : M I : : : : I : : : II 
1727 VLELFTNNK-LNDINDILKSVFLIFPHFCLGRGLIDMVKNQAMADALERFGE-NRFVSPL 1784 

1985 EWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVED-DVDVASERQRVLR 2043 



Db 


1785 


SWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRPVKAKLPPLNDEDEDVRRERQRILD 


1844 


Qy 


2044 


GDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTG 

| | | : : : | : | | | : | : : | I 1 1 1 : 1 : 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 - 1 1 1 1 1 1 1 
GGGQNDILEIKELTKIYRRK— RKPAVDRICIGIPPGECFGLLGVNGAGKSTTFKMLTG 


2103 


Db 


1845 


1901 


Qy 


2104 


DESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDE 

I | | : 1 1 : 1 : 1 : 1 : : 1 1 : : 1 1 1 1 1 1 1 : : 1 1 1 1 1 : : : IN* 1 : 
DTPVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAITELLTGREHVEFFALLRGVPEKEV 


2163 


Db 


1902 


1961 


Qy 

Db 


2164 
1962 


ARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARR 

: : | | : I I I ! 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
GKFGEWAI RKLGLVKYGEKYASNYSGGNKRKLSTAMALIGGPPWFLDEPTTGMDPKARR 


2223 
2021 


Qy 

Db 


2224 
2022 


FLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMI 

| I M I :: I | | | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 

FLWNCALSIVKEGRSWLTSHSMEECEALCTRMAIMVNGRFRCLGSVQHLKNRFGDGYTI 


2283 
2081 


Qy 




T VR- T KS S Q S VKD WRFFN RN FP EAMLKERHHT KVQ YQLKS EH I S LAQVFS KMEQVS GVL 

II I : : I 1 II 1 1 : : 1 1 1 : 1 : 1 1 1 1 1 1 1 1 : : 1 1 : 1 1 

WRIAGSNPDLKPVQEFFGLAFPGSVLKEKHRNMLQYQLPSSLSSLARIFSILSQSKKRL 


2342 


Db 


2082 


2141 


Qy 


2343 


GIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 III: 
HIEDYSVSQTTLDQVFVNFAKDQSDD 2167 




Db 


2142 





RESULT 4 
S71363 

probable ATP-binding cassette transporter ABC-3 - human 
N;Alternate names: ATP-binding cassette transporter ABC-C 
C;Species: Homo sapiens (man) 
C;Date: 29-Jan-1998 #sequence_revision 06-Feb-1998 #text_change 02-Feb-2001 
C;Accession: S71363 
R;Klugbauer, N.; Hofmann, F. 
FEBS Lett. 391, 61-65, 1996 
A;Title: Primary structure of a novel ABC transporter with a chromosomal 
localization on the band encoding the multidrug resistance-associated protein. 
A;Reference number: S71363; MUID:96326608; PMID:8706931 
A;Accession: S71363 
A;Status: nucleic acid sequence not shown 
A;Molecule type: mRNA 
A;Residues: 1-1704 <KLU> 
A;Cross-references: EMBL:X97187; NID:g1514529; PIDN:CAA65825.1; PID:e243436; 
PID:g1514530 
A;Experimental source: cell line medullary thyroid carcinoma 
C;Genetics: 
A;Gene: GDB:ABC3 
A;Cross-references: GDB:3770735; OMIM:601615 
A;Map position: 16p13.3-16p13.3 
C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette 
homology 
C;Keywords: ATP binding; nucleotide binding; P-loop; phosphoprotein; 
transmembrane protein 
F;255-283/Domain: transmembrane #status predicted <TM1> 
F;307-329/Domain: transmembrane #status predicted <TM2> 
F;345-364/Domain: transmembrane #status predicted <TM3> 
F;373-394/Domain: transmembrane #status predicted <TM4> 
F;401-422/Domain: transmembrane #status predicted <TM5> 
F;452-475/Domain: transmembrane #status predicted <TM6> 
F;549-739/Domain: ATP-binding cassette homology <ABC1> 
F;566-573/Region: nucleotide-binding motif A (P-loop) 
F;685-690/Region: nucleotide-binding motif B 
F;1100-1120/Domain: transmembrane #status predicted <TM7> 
F;1145-1169/Domain: transmembrane #status predicted <TM8> 
F;1181-1207/Domain: transmembrane #status predicted <TM9> 
F;1215-1236/Domain: transmembrane #status predicted <TM10> 
F;1245-1264/Domain: transmembrane #status predicted <TM11> 
F;1299-1324/Domain: transmembrane #status predicted <TM12> 
F;1399-1590/Domain: ATP-binding cassette homology <ABC2> 
F;1416-1423/Region: nucleotide-binding motif A (P-loop) 
F;1535-1540/Region: nucleotide-binding motif B 
F;674,866,1524/Binding site: phosphate (Ser) (covalent) (by cAMP-dependent 
kinase) #status predicted 
F;1344/Binding site: phosphate (Thr) (covalent) (by cAMP-dependent kinase) 
#status predicted 

Query Match 20.7%; Score 2622; DB 2; Length 1704; 
Best Local Similarity 34.0%; Pred. No. 4.2e-165; 
Matches 638; Conservative 317; Mismatches 556; Indels 364; Gaps 45; 

Qy 581 KGFPDEESIWYTLNQAYQDNV— TVFASVIFQ TRKDGSLPPHVHYKIR 627 

: | | I I : : I II : I I s I : I : I I I I : I 

Db 108 RGFPSEKDFEDY IRYDNCSSSVLAAWFEHPFNHSKEPLPLAVKYHLRFSYTRRNY 163 

Q y 628 — QNSSFTEK TNEIRRAYWRPG PNTGGRFYFLYGFVWIQDMMERAI I 672 

| | | | I : : I I I : I I I I : : I : : I I I : 

Db 164 MWTQTGSFFLKETEGWHTTSLFPLFPNPGPREPTSPDGGEPGYIREGFLAVQHAVDRAIM 223 

Qy 673 DTFVGHDWEPGSY VQMFP YPCYTRDDFLFVI EHMMPLCMVI SWVYSVAMT I QH 726 

: | : :: MM : I II I:: :M I: 

Db 224 EYHA— DAATRQLFQRLTVTIKRFPYPPFIADPFLVAIQYQLPLLLLLSFTYTALTIARA 281 

Qy 727 I VAEKEHRLKE WKTMGLNNAWWVAWFITGFVQLS I SVT ALTAI L KYGQVLMHS 781 

:| I I I I II I I: I I I :: :| I I I M |: I |: : :| : II I 

Db 282 WQEKERRLKEYMRMMGLSSWLHWSAWFLLFFLFLLIAAS FMTLLFCVKVKPNVAVLSRS 341 

Qy 782 HWIIWLFJ^VYAVATIMFCFLVSVXYSKAK^^ 841 

: : || : I : : II I I : I I : I II : I : I I I : I I M : I I : I I I : = 
Db 342 DPSLVLAFLLCFAISTISFSFMVSTFFSKANMAAAFGGFLYFFTYIPYFFVAPR----YN 397

Qy 842 KITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVE-GDDFNLLLAVTML 900 

: | : | : | : | I : I : : : I I : I I I I III I I I : I I 

Db 398 WMTLSQKLCSCLLSWAMAMGAQLIGKFEAKGMGIQWRDL-LSPVNVDDDFCFGQVLGML 456 

Qy 901 MvT)AVVYGILTWYIEAVTIPGMYGLPRPWYFPLQKSYWLGSGRT 960 

: : I : I : I I : : I I I : I I I II : I : I M I II : Ml I I I = 

Db 457 LLDSVLYGLVTWYMEAVFPGQFGVPQPWYFFIMPSYWCGKPRAVAGK 503 

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYK--DDKKLALNKLSLNLYEN 1018

||: : :: I I III I : : hM- : : M ■ MIMM 

Db 504 -EEEDSDPEKALRNEY FEAEPEDLVAGIKIKHLSKVFRYGNKDRAAVRDLNLNLYEG 559 



Qy 1019 QWSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFD 1078

|: I I M l I I I I I I : I : I I M I I I I I I I I |::| :l : I II : I I : I I I I = : I I I
Db 560 QITVLLGHNGAGKTTTLSMLTGLFPPTSGRAYISGYEISQDMVQIRKSLGLCPQHDILFD 619

Qy 1079 RLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAF 1138

| | | | | | : | | :: | | : :::: | : : I : : I : I : I : I I I I I : I I I I : I I
Db 620 NLTVAEHLYFYAQLKGLSRQKCPEEVKQMLHIIGLEDKWNSRSRFLSGGMRRKLSIGIAL 679

Qy 1139 VGGSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHG 1198

: | I : : I I I I I I : I : I : I I I I I I I : • I I I I : I : I I I I I I I I i I I I I I I I

Db 680 IAGSKVLILDEPTSGMDAISRRAIWDLLQRQKSDRTIVLTTHFMDEADLLGDRIAIMAKG 739

Qy 1199 KLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFI 1258

: I : I I I I I I I I I I I I : I I I I I I : : N :

Db 740 ELQCCGSSLFLKQKYGAGYHMTLVKEP HCNPEDISQLV 777

Qy 1259 RKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEV 1318

|| : || llhllll: : II II 11= I --III I I - I I I

Db 778 HHHVPNATLESSAGAELSFILPRESTHR--FEGLFAKLEKKQKELGIASFGASITTMEEV 835

Qy 1319 FLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASS 1378
||:| : | :| |:: : II : I I : I I I
Db 836 FLRVGK LVD S SMD I QAI Q LPALQ -- YQHERRASDWAVDSNL 874

Qy 1379 VGSARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGW-LKV 1437
| : : | | : I I I I : I I : I I

Db 875 CGAMDPSDGIG ALIEEERTAV KLNTGLALHC 905

Qy 1438 RQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHN 1497

: | | : : | : : | I : : I : I : I I : I : hill h : I

Db 906 QQTOAMFLKKAAYSWREWKMVAAQVLVPLTCWLALLAINYSSELFDDPMLRLTLGEY-- 963

Qy 1498 YTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCV-LKSPANGSLGPT 1556

III I II

Db 964 GRTWPFSVPGTSQLGQQ 981

Qy 1557 LNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSL 1616
I :

Db 982 LS 983

Qy 1617 PPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITG 1676

I : : | : I : I I I I :
Db 984 EHLKDALQAEG QEPREVLGDL 1004

Qy 1677 HNVSEYLLFTSDRFRLHRYGAITFG--NVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNN 1734

|:|:| : :: I I ' : III I : I : I I

Db 1005 EEFLIFRA-- SVEGGGFNERCLVAASF RDVGERTWNALFNN 1044

Qy 1735 KGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQG-TDV 1793

: Ml | I ::| : I | | | |:| | ::: : I :|

Db 1045 QAYH S PAT ALAWDN L L F KLLCGPHA-SIWSNFPQPRSALQAAKDQFNEGRKGF 1098

Qy 1794 VIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYWDMLNYLVP 1853

| | : : : I I : I : : : I : I : I : : : I I I : I I I I I : : I I : : I I : : : : I : I =
Db 1099 DIALNLLFAMAFLASTFSILAVSERAVQAKHVQFVSGVHVASFWLSALLWDLISFLIPSL 1158

Qy 1854 CCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINL 1913

: : : | I : I • I : | i i i i i i : i i • i i • i • i • • i i i • ■ •

Db 1159 LLLVVFKAFDVRAFTRDGHMADTLLLLLLYGWAIIPLMYLMNFFFLGAATAYTRLTIFNI 1218

Qy 1914 FIGITATVATFLLQLFEHDKDLKV--VNSYLKSCFLIFPNYNLGHGLMEMAYNEY 1966



|| MM: : | : : : I ||: M: M : II 

Db 1219 LSGI ATFLMVTIMRIPAVKLEELSKTLDHVFLVLPNHCLGMAVSSF--YENYETRRY 1273 

Qy 1967 INEYYAKIGQFDKMKSPFEWDI— VTRGLVAMAVEGWGFLLTIMCQYNFLRRPQ 2019 

: : | | : : : I . I I : : M I : I : : I I : I : 

Db 1274 CTS S EVAAHYCKKYNI QYQENFYAWS APGVGRFVASMAAS GCAYLI LLFLI ETNLLQRLR 1333 

Qy 2020 RM P VS T K P VE D DVD VAS E RQ RVL RGD ADN DM VKIENLTKV 2059 

MM : : I I M I I I : I I : : : I : I : I I 

Db 1334 GILCALRRRRTLTELYTRMPV LPEDQDVADERTRILAPSPDSLLHTPLIIKELSKV 1389 

Qy 2060 YKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSV 2119 

|:| : I I M II I I : M I I II I I II I I II : I M I I I : I I I I I : 

Db 1390 YEQRV--PLLAVDRLSLAVQKGECFGLLGFNGAGKTTTFKMLTGEESLTSGDAFVGGHRI 1447 

Qy 2120 LKELLQVQQSLGYCPQCD7VLFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKY 2179 

: : : | : I : M M I M I I : I I I I : I I I I I I = I : I I I : 

Db 1448 SSDVGKVRQRIGYCPQFDALLDHMTGREMLVMYARLRGIPERHIGACVENTLRGLLLEPH 1507 

Qy 2180 ADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSV 2239 

| : | I I I I I I I E I! I I I II I I I I II II II : M M I III II:: : : I : :: 
Db 1508 T^IKLVRTYSGGNKRKLSTGIALIGEPAVIFLDEPSTGMDPVARRLLWDTVARARESGKAI 1567 

Qy 2240 VLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKS SQSVKDV 2296 

: : M M I I I M II I I I I II I I I : : II I I I II I : : I I I I : : : I = ' ' : : 

Db 1568 IITSHSMEECEALCTRLAIMVQGQFKCLGSPQHLKSKFGSGYSLR7VKVQSEGQQEALEEF 1627 

Qy 2297 VRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDN 2356 

| : | | : : | : : | I I I : I I : I I s I • I : : I I I I I I : I : 

Db 1628 KAFVDLTFPGSVLEDEHQGMVHYHLPGRDLSWAKVFGILEKAKEKYGVDDYSVSQISLEQ 1687 

Qy 2357 VFVNFAKKQSDNLEQ 2371 



i i • • l i i i • 

Db 1688 VFLSFAHLQPPTAEE 1702



RESULT 5 
A59188 

ATP-binding cassette transporter ABC3 - human 
C; Species: Homo sapiens (man) 

C;Date: 18-Feb-2000 #sequence_revision 18-Feb-2000 #text_change 17-May-2002
C;Accession: A59188
R;Connors, T.D.; van Raay, T.J.; Petry, L.R.; Klinger, K.W.; Landes, G.M.; Burn,
T.C.
Genomics 39, 231-234, 1997
A;Title: The cloning of a human ABC gene (ABC3) mapping to chromosome 16p13.3.
A;Reference number: A59188; MUID:97179225; PMID:9027511
A;Accession: A59188
A;Status: preliminary; not compared with conceptual translation
A;Molecule type: mRNA
A;Residues: 1-1704 <CON>
A;Cross-references: GB:U78735; NID:g1699037; PIDN:AAC50967.1; PID:g1699038
C;Genetics:
A;Gene: GDB:ABC3
A;Cross-references: GDB:3770735; OMIM:601615
A;Map position: 16p13.3-16p13.3

C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette 
homology 

Query Match 20.7%; Score 2622; DB 2; Length 1704; 

Best Local Similarity 34.0%; Pred. No. 4.2e-165; 

Matches 638; Conservative 317; Mismatches 556; Indels 364; Gaps 45; 

Qy 581 KGFPDEES I VNYTLNQAYQDNV — TVFASVI FQ TRKDGSLPPHVHYKIR 627 

: | | | | : : I II : I I : I : I : II I I : I 

Db 108 RGFPSEKDFEDY IRYDNCSSSVIAAWFEHPFNHSKEPLPLAVKYHLRFSYTRRNY 163

Qy 628 QNSSFTEK TNEIRRAYWRPG PNTGGRFYFLYGFVWIQDMMERAII 672 

| | | I I : : I I I : I I I I : : I : : I I I : 

Db 164 MWTQTGSFFLKETEGWHTTSLFPLFPNPGPRELTSPDGGEPGYIREGFLAVQHAVDRAIM 223 

Qy 673 DTFVGHDWEPGSY VQMFPYPCYTRDDFLFVIEHMMPLCMVT SWVYSVAMTIQH 726 

: | : ' : : I I I I : III | : : : I I : : : I : I : : 

Db 224 EYHA — DAATRQLFQRLTVTIKRFPYPPFIADPFLVAIQYQLPLLLLLSFTYTALTIARA 281 

Qy 727 I VAE KEH RL KEVMKTMG LNNAVHWVAW F I T G FVQ LSI S VT ALT AI L KYGQVLMHS 781 

: | | | | I I I I I : I I I : : : I I I I I : I : I I : : : I : ' II I 

Db 282 WQEKERRLKEYMRMMGLSSWLHWSAWFLLFFLFLLIAASFMTLLFCVKVKPNVAVLSRS 341 

Qy 782 HWIIWLFIJ\WAVATIMFCFLVSVXYSKA^ 841 

: : II : I : : I I I I : I I : I I I : I : I I I : I I : I : i I : I I I • '• 

Db 342 DPSLVLAFLLCFAISTISFSFMVSTFFSKANMAAAFGGFLYFFTYIPYFFVAPR----YN 397

Qy 842 KITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVE-GDDFNLLLAVTML 900 

:| :| : 1:1 I :|:: . :| hi Ml Ml Ml : M 

Db 398 WMTLSQKLCSCLLSNVAMAMGAQLIGKFEAKGMGIQWRDL-LSPVNVDDDFCFGQVLGML 456 

Qy 901 MVDAWYGI LTW Y I EAVH P GMYGL P RPW Y F P LQ K S YW LG S GRT EAWEW S W PWART P RL S V 960 

: : | : |: M : : II h M I I I M : I M M I : I I I I I I : f 
Db 457 LLDSVLYGLVTWYMEAVFPGQFGVPQPWYFFIMPSYWCGKPRAVAGK 503 

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYK— DDKKLALNKLSLNLYEN 1018 

||:: : : I I III I : : h I I : : : : h h II I II 

Db 504 -EEEDSDPEKALRNEY FEAEPEDLVAGIKIKHLSKVFRVGNKDRAAVRDLNLNLYEG 559 

Qy 1019 QWS FLGHNGAGKTTTMS I LTGLFPPTSGSATI YGHDI RTEMDEI RKNLGMC PQHNVLFD 1078 

I : | II II II I I I I : I : II I I II I I II I I MM M : I I h I h I I I h : I I I 
Db 560 QITVLLGHNGAGKTTTLSMLTGLFPPTSGRAYISGYEISQDMVQIRKSLGLCPQHDILFD 619 

Qy 1079 RLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAF 1138 

Ml | | | : | | : : | | : : :: : | : M : : I . : I : I : -I I I I h I I I h I I 
Db 620 NLTVAEHLYFYAQLKGLSRQKCPEEVKQMLHIIGLEDKWNSRSRFLSGGMRRKLSIGIAL 679 

Qy 1139 VGGSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHG 1198 

: II: :M III IM :l : I I I I I I h : I I I h h I I I I I I I I I II I I I h : I 
Db 680 IAGSKVLILDEPTSGMDAISRRAIWDLLQRQKSDRTIVLTTHFMDEADLLGDRIAIMAKG 739 

Qy 1199 KLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFI 1258 

: I : I I I I MM II II Mill I I : Ml: 

Db 740 ELQCCGSSLFLKQKYGAGYHMTLVKEP HCNPEDISQLV 777 



Qy 1259 RKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEV 1318 

II: II I I I : I I I I : : I I I I I I : I : : I I I I I M I I 

Db 778 HHHVPNATLES SAGAELS FI LPRESTHR — FEGLFAKLEKKQKELGI AS FGAS ITTMEEV 835 

Qy 1319 FLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASS 1378 

I I : I : | : | | : : : I I : I I : I I I 
Db 836 FLRVGK LVDSSMDIQAIQ LPALQ— YQHERRASDWAVDSNL 874 

Qy 1379 VGSARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGW-LKV 1437

| : : | | : I I I I : I I : I I 

Db 875 CGAMDPSDGIG ALIEEERTAV KLNTGLALHC 905 

Qy 1438 RQ FH GL LVK RFH C ARRN S KAL FS Q I LL PAFFVCVAMTVAL S VP E I GD L P P LVL S P S Q YHN 1497 

: I I : : I : : I I : : I : I : I I : I : I : I I I I : : I 

Db 906 QQFWAMFLKKAAYSWREWK1WAAQVLVPLTCWLALLAINYSSELFDDPMLRLTLGEY-- 963 

Qy 1498 YTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCV-LKSPANGSLGPT 1556 

III I M 

Db 964 GRTWPFSVPGTSQLGQQ 981 

Qy 1557 LNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSL 1616 

I : 

Db 982 LS 983 

Qy 1617 PPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITG 1676 

I : : |:| : I I I I : 
Db 984 EHLKDALQAEG QEPREVLGDL 1004 

Qy 1677 HNVSEYLLFTSDRFRLHRYGAITFG--NVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNN 1734

|:|:| : :: I I : III I : I =11 

Db 1005 EEFLIFRA SVEGGGFNERCLVAAS F RDVGERTWNALFNN 1044 

Qy 1735 KGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQG-TDV 1793 

: | I I || : : I : I II I hi I : = : =1 : I 

Db 1045 Q AYH S P AT ALAWDN L L F KLLCGPHA- S I WSNFPQPRSALQAAKDQFNEGRKGF 1098 

Qy 1794 VI AI F 1 1 VAMS FVP AS FWFLVAEK S T KAKH LQ FVS GCN P 1 1 YW LAN YVW DMLN YL VP AT 1853 

| | : : : | | : | : : : I : I : I : : : I I I : I I I I I : : I I : : I I : : : : I : I : 
Db 1099 D I ALN LL FAMAFLAS T FS I LAVS ERAVQAKHVQ FVS GVH VAS FWL S AL LWDL I S FL I P S L 1158 

Qy 1854 CCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINL 1913 

: : : I I : I : I : I I I I I I I : I I : I I : I : I -11 I : I : 
Db 1159 LLLVVFKAFDVRAFTRDGHMADTLLLLLLYGWAIIPLMYLMNFFFLGAATAYTRLTIFNI 1218

Qy 1914 FIGITATVATFLLQLFEHDKDLKV— VNSYLKSCFLIFPNYNLGHGLMEMAYNEY 1966 

|| MM: : | : : : I I I : I I : II : I I 

Db 1219 LSGI ATFLMVTIMRIPAVKLEELSKTLDHVFLVLPNHCLGMAVSSF-YENYETRRY 1273 

Qy 1967 INEYYAKIGQFDKMKSPFEWDI — VTRGLVAMAVEGWGFLLTIMCQYNFLRRPQ 2019 

: : I I : : : I I I : : I I I M : : I I M : 

Db 1274 CTSSEVAAHYCKKYNIQYQENFYAWSAPGVGRFVASMAASGCAYLILLFLIETNLLQRLR 1333 

Qy 2020 RMPVSTKPVEDDVDVASERQRVLRGDADNDM — -VKIENLTKV 2059 

MM : : I II I I I I M I : : - I = 1 = 11 
Db 1334 GI LCALRRRRTLTELYTRMPV LPEDQDVADERTRILAPSPDSLLHTPLIIKELSKV 1389 



Qy 2060 YKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSV 2119

1 : 1 : 1! 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 : 1 1 1 hill M :
Db 1390 YEQRV--PLLAVDRLSLAVQKGECFGLLGFNGAGKTTTFKMLTGEESLTSGDAFVGGHRI 1447

Qy 2120 LKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKY 2179

:: :|:| :||||| III 1 :| II 1 :| Mill : h 1 II :

Db 1448 SSDVGKVRQRIGYCPQFDALLDHMTGREMLVMYARLRGIPERHIGACVENTLRGLLLEPH 1507

Qy 2180 ADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSV 2239

I : | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 111111:11111 III II: : : : I :': :
Db 1508 ANKLVRTYSGGNKRKLSTGIALIGEPAVIFLDEPSTGMDPVARRLLWDTVARARESGKAI 1567

Qy 2240 VLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKS SQSVKDV 2296

: : I I I 1 II 1 1 1 1 1 1 1 II 1 1 1 1 1 : Mill 1 1 1 1 : : 1 1 II : : ' 1 :

Db 1568 IITSHSMEECEALCTRLAIMVQGQFKCLGSPQHLKSKFGSGYSLRAKVQSEGQQEALEEF 1627

Qy 2297 VRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDN 2356

I : M : : 1 : : 1 III : 1 1 : 1 1 ' 1 : 1 : : 1 1 1 M 1 : h

Db 1628 KAFVDLTFPGSVLEDEHQGMVHYHLPGRDLSWAKVFGILEKAKEKYGVDDYSVSQISLEQ 1687

Qy 2357 VFVNFAKKQSDNLEQ 2371
II : : 1 1 1 1 :

Db 1688 VFLSFAHLQPPTAEE 1702





RESULT 6 
T33783 

hypothetical protein Y39D8C.1 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 29-Oct-1999 #sequence_revision 29-Oct-1999 #text_change 17-Mar-2000 
C;Accession: T33783 

R;Becker, M.; Graves, T.; Yoakum, M.
submitted to the EMBL Data Library, October 1998
A;Description: The sequence of C. elegans cosmid Y39D8C.
A;Reference number: Z21408
A;Accession: T33783
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1802 <BEC>
A;Cross-references: EMBL:AF101313; PIDN:AAC69223.1; GSPDB:GN00023; CESP:Y39D8C.1
A;Experimental source: strain Bristol N2; clone Y39D8C
C;Genetics:
A;Gene: CESP:Y39D8C.1
A;Map position: 5

A;Introns: 45/3; 114/1; 195/1; 230/3; 543/3; 794/1; 849/1; 1036/2; 1099/1; 
1132/3; 1165/1; 1322/3; 1458/2; 1560/3; 1656/1 

C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette 
homology 

Query Match 16.0%; Score 2024; DB 2; Length 1802; 

Best Local Similarity 28.6%; Pred. No. 2.8e-125; 

Matches 539; Conservative 319; Mismatches 629; Indels 400; Gaps 48; 

Qy 580 FKGFPDEESIVNYTLNQAYQ — DNVTVFASVI FQTRKDGSLPPHVHYKIRQNSSFTEKTN 637 

: I I I I : I : : I II : I : : I I I ' ' ::: ::| : : 

Db 168 YKGFTTEGEMVSWMQGQFQSECDN-PLLAGIVF DDSIAKDLKNPDKRDFTYTIRLS 222 



Qy 638 EIRR AY-W RPGPNTGGR- FYFLYGFVWIQDMMERAI 671 

| : | | I I II | : | | : : | : : I I 

Db 223 NTH RRS RN A FG DN S Y PW DT S VS FAVQ YVS GPINPDDNDGGSP G YWQ E G FMT VQ RAVDVAI 282 

Qy 672 IDTFVGHDV-VEP--GSY-VQMFPYPCYTRDDFLFVIE---HMMPLCMVISWVYSVAMTI 724 

: I I : I III Ihl I: :ll ||: :: |:: II : 

Db 283 TEIITGEDAQLTPLLDSYQVSRFPFPGYSTK IIEIGAFFMPVIVIFSFMTSVIYIV 338 

Qy 725 Q H I VAE K EH RL KE VM KTMG LN N AVH WVAW F I T G FVQ LSI S VT ALT AI L K YGQ VLMH S H W 784 

: : I I I I I I I I I : I I I : : : I I I I I : : I : : I I I : : : I : I : 
Db 339 RAVWEKEDRLKEYMRVMGLSQFINWVAHFIINYAKLTFAVIVLTILMHF — VALKSDMT 396 

Qy 785 IIWLFIAVYAVATIMFCFLVSVLYSKAKIASACGGIIYFLSWPYMWAIREEVAHDKI 844 

: : : : I I : I I : I I : : I : I I : : : I I I : : : I : 

Db 397 LMFVFLMI YAFDWYFAFMI S S FMNSATSATLI SWFWMLLYFWYAFFS SIDQTN 451 

Qy 845 AFE KC I AS LMS TTAFGL G S K Y FAL YEVAGVG I QWHT F S Q S P VEG D D FN LLLAVTMLM 901 

: : I : I I : I I I I : : I I : : I : I : 

Db 452 PYPLGYRLINCINPDIALNYGLQLL7VAYETQADGLKWGELFTPPSPDNNLTFGHALIALI 511 

Qy 902 VDAWYGILTWYIEAVHPGMYGLP-RPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960 

11:: I I I I I I I I I I I I : I : I I : I I III I 
Db 512 VDGIIMIILTWYIEAVIPGGEGVPQKPWFFVL-PSYWF PNSGS 553 

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVY KDDKKL 1006 

| : : : : : : I : I I I I : I ' I I I I I : I 

Db 554 KTVDSSDQFQQIQYADHVKLEKEPTDLIPTINWNLTKTYGTSFFKKLFDCKFGKSGEKR 613 

Qy 1007 ALNKLSLNLYENQWS FLGHNGAGKTTTMS I LTGLFP PTS GSAT I YGHDI RTEMDEI RKN 1066 

| : : | : | : I i I I I I I I I I : I I I : I I I : I : I I I I : | I I I : : I I : 
Db 614 AVSNLNLKMYPGQCTVLLGHNGAGKSTTFSMLTGVASPSSGSAYVNDFDIRTSLPKIRRE 673 

Qy 1067 LGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSG 1126 

: I : I I I : I I I : I I I I I I : : : I I : I : : : I : I : I I I 
Db 674 MGLCPQYNTLFGFMTVMEHLEFFAKLKERTWDP — EEAREILARLRIDFKADFMAGALSG 731 

Qy 1127 GMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEAD 1186 

I I I I I I : I II : I I I : : I I I I I : I : I I I I I I I : I I I I I I : I I I : I I I 
Db 732 GQKRKLSLAIALIGGSEWMLDEPTSGMDPGARHETWTLIQREKERRTILLTTHFMEEAD 791 

Qy 1187 LLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPL 1246 

I I I I I I I I :: I I : I : I I I I I : I I I I I I I I I I : I I : I 

Db 792 LLGDRIAIMAHGQLECCGSPMFLKQQYGDGYHLTIVY DTTSTP 834 

Qy 1247 SSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLS 1306 

■ : : | | : : : : | I : I : I I : I : I I : I I : = 

Db 835 DVSKTTDIIREYIPEAHVFSYIGQEATYLL — SATHRPIFPKLFKELEDHQTQCGIT 889 

Qy 1307 S FGLMDTTLEEVFLKVS EEDQS LENS EADVKES RKDVLPGAEGPAS GEGHAGNLARCS EL 1366 

III: Ihlllllll II : : : 
Db 890 S FGVS I TTMEEVFLKVGHTADERYN YEHGI EN D I S EMI 927 

Qy 1367 TQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQG 1426 

: I : I I :: I : 

Db 928 EKDDPI LQDLRAQV 941 



Qy 1427 SRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLP 1486

: : | | . - . - M i i - • i I * *

Db 942 --RVTGFTLQMQHAKAMFYKRAIFFFRKWTQFLPQLVFPVAYLVLMVFTSQVLPSVKEQD 999

Qy 1487 PLVLSPSQYHNYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLK 1546

|:| | : : : : : : I I I III
Db 1000 PQTIS-- LAPFSDTKKAGH LVS DSGNYVTL 1027

Qy 1547 SPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPD 1606
II : III

Db 1028 LGGSQNLS 1035

Qy 1607 EDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRW 1666

III 1:1

Db 1036 SMVQGTVTQLGVT 1048

Qy 1667 TGDILTDITGHNVSEYLLFTSDRFRLHRYGAITFG--NVLKSIPA--SFGTRAPPMVRKI 1722

; | | | ||:::: : : hill I : I : : I I : I : :

Db 1049 -- QTWDITS-NVEKFIMDQTNAM GSRTFGLHYALGFVPSMFNFSTVSVPSLK 1098

Qy 1723 AVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPM-NKTSAS 1781

| : | | I :: : : : I I I : I I I I I : I =

Db 1099 T FFNN FGL YT PALAI T FT D SMI L SQKQKKQYSFTAVNHPLPPSTQDT 1145

Qy 1782 LSLDYLLQGTDWIAIFIIVAMSFVPAS FWFLVAEKSTKAKHLQFVSGCNPIIYWLANY 1841

| | : | | : I I : : I : I I : I : I : I I : I : I I I : : I I :

Db 1146 LKNTNRSDGAAFLIAYGLIVSFAVCVAGYSQFLITERKKKSKHMQLLSGIRPWMFWLTAF 1205

Qy 1842 VWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSL-FLLYGWSITPIMYPASFWFEV 1900

: | | : : : I I : : I : : III : I : I I I I I I I : I I I • I I

Db 1206 IWDAAWFVIRILCFDAIFYIFNITAYTHDFGVMLILTLSFLLYGWTALPFTYWFQFFFES 1265

Qy 1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFL -- I FPNYNLGHGL 1958

:::::: I : : : | : : : : III : II I I :

Db 1266 APKGFMMVTMYHILTGMIGSIAVPII SQTSSLDAGYLWSIIFAWLFPTYNISQIA 1320

Qy 1959 MEMAYNEYINEYYAKIGQFDKMKSPFEWDIVT RGL 1993

II: I : I : I I : I :

Db 1321 TVTFQNENVRIACKKLDCTIPMFKAVTACCGTASERLYVDNVLFVGNRKGILVYV 1375

Qy 1994 VAMAVEGWGFLLTIM CQYNFLRRPQRMPVSTKPVED DVDVASE 2037

: : I I : I : : : I I : I I : I : I I I I : I

Db 1376 IFLAVQGFIYWIWVFMRENDQFTKLFALIRCRKADNPIWDITDTDKVDERDVEDSDVIAE 1435

Qy 2038 RQRVLRGDADNDMVKI-ENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTS 2096

: || : | : I I I I I II : I : I I I I I I I I I I I I I I

Db 1436 KSWQRLANNNKTALVSNNLVKWY GNFNAVKGVNFHVNSKDCFGLLGVNGAGKTS 1490

Qy 2097 TFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLR 2156

I I : I I I I : I : I : I : I I I I I : : Mill I'h h= I I :: Ml

Db 1491 TFQMLTGENSISSGDAYVNGWSVKNNWREAGANTGYCPQYDAIIKEMSGEETLYMFARIR 1550

Qy 2157 GISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTG 2216

|| || : | : : : I I : I I I I I II M I I I I : : I I : I I I I I : I

Db 1551 GIPEKDIPKKVNAVIHAIGIGMYASRQIKTYSGGNKRRLSLGIAIVGLPDVLLLDEPTSG 1610

Qy 2217 MDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNR 2276
= 111111 :||:: I I :: | I I I I I I : I I I I I I I Mill M II II IMMI

Db 1611 VDPKARRIIWNILNRLRDLGTALVLTSHSMDECEALCTELAIMVYGKFRCYGSCQHIKSR 1670



Qy 2277 FGDGYMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKME 2336

: | I I : : I I : : : I : : : I I I : : : : : I : : : I I : I

Db 1671 YGSGYTLLIRLKNRNDAEKTKSTIKQTFRGSVIKEEHVLQLNFDIPRDGDSWSRLFEKLE 1730

Qy 2337 QVSGVLGIEDYSVSQTTLDNVFVNFAK 2363

II I : I I I : I I I I I : I I : I : :
Db 1731 TVSTSLNWDDYSLSQTTLEQVFIEFSR 1757

RESULT 7
A84845

probable ABC transporter [imported] - Arabidopsis thaliana
C;Species: Arabidopsis thaliana (mouse-ear cress)

C;Date: 02-Feb-2001 #sequence_revision 02-Feb-2001 #text_change 02-Feb-2001
C;Accession: A84845

R;Lin, X.; Kaul, S.; Rounsley, S.D.; Shea, T.P.; Benito, M.I.; Town, C.D.;
Fujii, C.Y.; Mason, T.M.; Bowman, C.L.; Barnstead, M.E.; Feldblyum, T.V.; Buell, C.R.;
Ketchum, K.A.; Lee, J.J.; Ronning, C.M.; Koo, H.; Moffat, K.S.; Cronin, L.A.;
Shen, M.; VanAken, S.E.; Umayam, L.; Tallon, L.J.; Gill, J.E.; Adams, M.D.;
Carrera, A.J.; Creasy, T.H.; Goodman, H.M.; Somerville, C.R.; Copenhaver, G.P.;
Preuss, D.; Nierman, W.C.; White, O.; Eisen, J.A.; Salzberg, S.L.; Fraser, C.M.;
Venter, J.C.
Nature 402, 761-768, 1999

A;Title: Sequence and analysis of chromosome 2 of the plant Arabidopsis
thaliana.
A;Reference number: A84420; MUID:20083487; PMID:10617197
A;Accession: A84845
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-1816 <STO>
A;Cross-references: GB:AE002093; NID:g6598351; PIDN:AAC02761.2; GSPDB:GN00139
C;Genetics:
A;Gene: At2g41700
A;Map position: 2

Query Match 15.5%; Score 1964.5; DB 2; Length 1816; 

Best Local Similarity 26.9%; Pred. No. 2.6e-121; 

Matches 580; Conservative 304; Mismatches 596; Indels 677; Gaps 63; 

Qy 525 iHPEALNLSLDELPPALRQDNFSLPSGMALL QQLDT I DNAACGWI Q FMS - 573

|| I : I : : : : I I : I I I : I I : I

Db 15 :HPAHSNIDKDTWEVGKGNSPSFPEVLKLLLAEGDFLAFAPDTDETNN MIDILSL 70

Qy 574 KVSVDI FKGFPDEES I VNYTLNQAYQ DNVT VFAS VI FQT RKDG S L P P 620

: : Ml | : : | : I I : : I : I : I
Db 71 vFPELRLVTKI FK DDIELETYITSAHYGVCSEVRNCSNPKIKGAWFHEQ GP 122

Qy 621 IV-HYKI RQNSS FTEKTNEI RRAYWRPGPN TGGRF YFLY 658

: I I I I : : : I I II- I

Db 123

Qy 659

I hi II

Db 173



Qy 698 DDFLFVIEHMMPLCIWISWVYSVAMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITG 757
1:1 ::: :| I

Db 233 DEFQS I VKSVMGL 245

Qy 758 FVQLSISWALTAILKYGQVmHSHWIIWLFLAVYAVATIMFCFLVSVLYSKAKLASAC 817

: | | I ::: : :: :: I I I :: I ::: I I I I

Db 246 FLFKY SDKTLVFTYFFLFGLSAIMLSFMISTFFTRAKTAVAV 287

Qy 818 GGIIYFLSYVPYMYVAI REEVAHDKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQ 877

I : : : : I I I : : : I : I I I : I I I I I I I I I I I I I I : :

Db 288 GTLTFLGAFFPYY T VN DE S VSMVLKWAS L L S P T AFAL G S I N FAD Y E RAH VG L R 341

Qy 878 WHTFSQSPVEGDDFNLLIAVTMLMVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYW 937

| : : : : : I : : : I : : : I I I : : I I I : III I :

Db 342 WSNIWRA S SGVS FFVCLLMMLLDS I LYCALGLYLDKVLPRENGVRYPWNFI FS KYFG 398

Qy 938 LGSGRTEAWEWSWPWARTPRLSVMEEDQACAMESRRFEETRGMEEEPTHLPL -- 989

: I : I I I | : | : I :

Db 399 RKKNNLQ NRIPGFETD MFPADIEVNQGEPFDPVFESISLEMRQQE 443

Qy 990 WCVDKLTKVYKDDKK--LALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFP 1043

: | | | M : I : I I I I I I I I : : I I I II I I I I : I i : i : I M I

Db 444 LDGRCIQVRNLHKVYASRRGNCCAVNSLQLTLYENQILSLLGHNGAGKSTTISMLVGLLP 503

Qy 1044 PTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRRE 1103

I I I I I I I : I I I I I I I I I M I I I M M I III III :: II : : ::

Db 504 PTSGDALILGNSIITNMDEIRKELGVCPQHDILFPELTVREHLEMFAVLKGVEEGSLKST 563

Qy 1104 MDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIW 1163

: | I : : I I : I : : I I : I I I I I I I I I I : II : I I : I I I I I I I : I : I I I : I I

Db 564 WDMAEEVGLSDKINTLVRALSGGMKRKLSLGIALIGNSKVIILDEPTSGMDPYSMRLTW 623

Qy 1164 DLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVK 1223
II | | II 111:11 MM: Mill |:::| I I I I I I MM M II Mill
Db 624 QLIKKIKKGRIILLTTHSMDEAEELGDRIGIMANGSLKCCGSSIFLKHHYGVGYTLTLVK 683

Qy 1224 RPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEA 1283

: I I : : : : I : I II: MM II
Db 684 TSPTVSVA AHIVHRHI PSATCVSEVGNEI S FKLP -- L 718

Qy 1284 AKKGAFERLFQHLERSL DALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKE 1338

I | | : | : : | : I : MM II II II I M M M: :

Db 719 ASLPCFENMFREIESCMKNSDSDYPGIQSYGISVTTLEEVFLRVA GCNLDIED 771

Qy 1339 SRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYR 1398

::M : : I I : :| I : I I I I : I

Db 772 KQEDIFVSPDTKSS LVC -- I G SNQKS SMQ P KLLAS CN DGAGVI I T S VAKAFR 821

Qy 1399 PL -FDNPQDPDWSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFH 1449

: I : M I Ml II I Ml

Db 822 LIVAAVWTLIGF ISIQCCGCSIISR SMFW-- RHCKALFIKRAR 862

Qy 1450 CARRNSKALFSQILLPAFFV CVAMTVALSVPEI GDLP- 1486

I I : I : I :: I I M : M I I ': I M

Db 863 SACRDRKTVAFQFIIPAVFLLFGLLFLQLKPHPDQKSITLTTAYFNPLLSGKGGGGPIPF 922

Qy 1487 PLVLSPSQY-- HNYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVG 1540



I : : M : I I I : : : : I : 

Db 923 DLSVPIAKEVAQYIEGGWIQPLRN TSYKFPNPKE 956 

Qy 1541 ATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSD 1600 

I I : I I II I I I I I : I I : I : I I : ' 

Db 957 ALADAI DAAGPTLGPTL-LSMSE — FLMSSFDQS — YQSSREGL SSHD 999 

Qy 1601 SPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHP 1660 

I II I l :: 
Db 1000 SCNHPDGSL GYT 1011 

Qy 1661 PQMRWTGDILTDITGHNVSEYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVR 1720 

Db 1012 7 1011 

Qy 1721 KI AVRRAAQVF YNNKG YHSMPT YLN S LNNAI LRANL P KS KGN PAAYGI TVTNH PMNKT S A 1780 

I : I | : | | : | : : | | | | : I I I III: I 

Db 1012 VLHNGTCQHAGPIYINVMHAAILRL ATGN-KNMTIQTRNHPLPPTKT 1057 

Qy 1781 SLSLDYLLQGTDV VIAIFIIVAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYW 1837 

: | | : I I : : I I I : I I I I I : I I : I I I I I : I I I I 

Db 1058 Q RI Q RH D L DAF S AAI I VN I AFS F I PAS FAVP I VKE RE VKAKHQQ LIS GVS VL S YW 1112 

Qy 1838 LAN YVWDMLN YLVPAT CCVI I L FVFDL P AYT S PTN FP AVL S L FLL YGW S I T P I MYPAS FW 1897 

I : | | | I : : : I I : I : I : : I I : I : : I I I : I I : I : 

Db 1113 LSTYVWDFISFLFPSTFAIILFYAFGLEQFIGIGRFLPTVLMLLEYGLAIASSTYCLTFF 1172 



Qy 1898 FE VPSSAYVF LIVINLFIGITATVATFLLQLFEHDKDLKV 1937

| : | | : I I : : : : : | | : I : I : : I 
Db 1173 FTEHSMAQATSSYSVLLPISLFVFSFSSNVILMVHFFSGLILMVISFVMGLIPATAS 1229 



Qy 1938 VNSYLK SCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMK-SPFEW 1986 

| | | | | : | : | : I I : I : I I I III 

Db 1230 ANS YLKELI LFRYALQNFFRLS PGFC.FSDGLAS LA LLRQGMKDKSSHGVFEW 1281 

Qy 1987 DIVTRGLVAMAVEGWGFLLTIMCQYNFL RRP 2018 

Db 1282 NVTGAS I CYLGLEVRLEY CRYSMLLLSFFHGIDTKLSLIYTIGASRLTELIYDRV 1336 

Qy 2019 QRMPVSTKP VEDDVDVASERQRVLRGDADNDMVKIENLTKVYKSRK-I 2065 

| I : | : I I I : I I I I I I : I : I I I : : : I I I I I I 

Db 1337 YSTSFSTEPLLKDSTGAISTDMEDDIDVQEERDRVISGLSDNTMLYLQNLRKVYPGDKHH 1396 

Qy 2066 GRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQ 2125 

| : I I I I : I I I I I I I I I I I I I : I I I : I : I : I I I I : I = = 

Db 1397 GPKVAVQSLTFSVQAGECFGFLGTNGAGKTTTLSMLSGEETPTSGTAFIFGKDIVASPKA 1456 

Qy 2126 VQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKPAG 2185 

: : | : I I I I I I I II : I I : I I I : I I I : : I : I I I : : I I : : I I : 

Db 1457 IRQHIGYCPQFDALFEYLTVKEHLELY7VRIKGWDHRIDNWTEKLVEFDLLKHSHKPSF 1516 

Qy 2186 T YSGGNKRKLSTAIALIGYPAFI FLDEPTTGMDPKARRFLWNLI LDL- 1 KTGR- SWLTS 2243 

| | | | | | | | I I I I I : I I I : I I I I : I I I I I I : I I : I : : I I : : I : : I : I I : 
Db 1517 TLSGGNKRKLSVAIAMIGDPPIVILDEPSTGMDPVAKRFMWDVISRLSTRSGKTAVILTT 1576 



Qy 2244 HSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGD 2279
I II I : I I I I I : I I I I I II I : II I I II I : I :

Db 1577 HSMNEAQALCTRI GIMVGGRLRC I GS PQHLKTRYGNHLELEVP FYNGVKPNEVSNVELEN 1636

Qy 2280 -GYMITVRTK SSQSVKDWRFFNR- 2302

: : I I : | : | : : : |

Db 1637

Qy 2303 -NFPEA 2307
: I I I

Db 1697

Qy 2308

Ml : I I I I : I : I I I : I I : I I : I I : : I : I I

Db 1757

RESULT 8
T47150



hypothetical protein DKFZp547P193.1 - human (fragment)
C;Species: Homo sapiens (man)
C;Date: 20-Apr-2000 #sequence_revision 20-Apr-2000 #text_change 02-Sep-2000
C;Accession: T47150
R;Bloecker, H.; Boecher, M.; Brandt, P.; Mewes, H.W.; Weil, B.; Wiemann, S.
submitted to the Protein Sequence Database, March 2000
A;Reference number: Z24376
A;Accession: T47150
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-373 <AAA>
A;Cross-references: EMBL:AL162060
A;Experimental source: fetal brain; clone DKFZp547P193
C;Genetics:
A;Note: DKFZp547P193.1

C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette 
homology 

Query Match 15.2%; Score 1920; DB 2; Length 373; 

Best Local Similarity 100.0%; Pred. No. 1.3e-119; 

Matches 373; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 2064 KIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKEL 2123

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 KIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKEL 60

Qy 2124 LQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKP 2183

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 LQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKP 120

Qy 2184 AGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSVVLTS 2243

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSVVLTS 180

Qy 2244 HSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKDVVRFFNRN 2303

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 HSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKDVVRFFNRN 240

Qy 2304 FPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAK 2363

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||



Db 241 FPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAK 300 

Qy 2364 KQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPEDLDTEDEGLISFEE 2423 

I I I J I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I 

Db 301 KQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPEDLDTEDEGLISFEE 360 

Qy 2424 ERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I 
Db 361 ERAQLSFNTDTLC 373 



RESULT 9 
T15200 

hypothetical protein F12B6.1 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 17-Mar-2000 

C;Accession: T15200 

R;Pauley, A.; Maggi, L.
submitted to the EMBL Data Library, May 1997
A;Description: The sequence of C. elegans cosmid F12B6.
A;Reference number: Z18307
A;Accession: T15200
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1447 <PAU>
A;Cross-references: EMBL:AF003138; NID:g2088708; PID:g2088709; PIDN:AAB54153.1;
GSPDB:GN00019; CESP:F12B6.1
A;Experimental source: strain Bristol N2; clone F12B6
C;Genetics:
A;Gene: CESP:F12B6.1
A;Map position: 1

A;Introns: 79/2; 114/3; 177/1; 224/3; 331/1; 345/3; 373/2; 417/2; 464/1; 536/1; 
659/2; 688/2; 729/3; 776/2; 889/1; 977/1; 1065/1; 1117/2; 1223/2; 1273/3 
C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette 
homology 

Query Match 13.6%; Score 1718.5; DB 2; Length 1447; 

Best Local Similarity 27.4%; Pred. No. 3.9e-105; 

Matches 495; Conservative 268; Mismatches 542; Indels 503; Gaps 50; 

Qy 655 YFLYGFVWIQ DMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRD 698 

I : | | : : | : ::|||: I I I MM :l 

Db 19 YITFGFSFLQGSWPSLEQKSKSQKLSESIDRAIMSELTNQTDANLGVYAQQEPYPCTVKD 78 

Qy 699 DFLFVIEHMMPLCMVISWVTSVAMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGF 758 

I : Ml : : I I : : : I : : : : I I I I I : : I I I I : I I I : I I I : : : I : 
Db 79 TFNVAL--FMPLFLLISFIFPSALLVKNIVYEKEQKIKEQMRAMGLGDAVHFISWGLISL 136

Qy 759 VQLSI SWALTAILKYGQVTjMHSHVVI IWLFIAWAVATIMFCFLVSVLYSKAKLASACG 818 

| Ml : : I I : : : : : : I ' ' : : I | | : : I : I : I 

Db 137 VLNFI SVLI I S 1 1 SKVAKI FDYTDYTLLLFVLI LFLFS S I AMS I FFSTLFTNANI ATAAT 196 

Qy 819 GIIYFLSYVPYMYVAI REEVAHDKITA — FEKC IAS LMSTTAFGLG S KY — - FALYEVAG 873 

: : : | : : : I : : I : I : : I : I : : : I I I I I II 

Db 197 CVLWFVFFI PFQLLRT DRISSPTFNR-ISLILPPTAMGHCFKLLESFNAMERAT 249



Qy 874 VGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQ 933

I : I I I : : | : I I : I I I : I I I I I I I I I : I : : I : 1 I 

Db 250 WSDLWE — MNNPVLG — I SVELCMIMLWDTAVFLILAWYI SAVAPGDFGVRQPLWFPFT 305 

Qy 934 KSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQACAMESRRFEETRGMEEEPTHLPLWCV 993 

|| | |: :::: : : : : I I I - I 
Db 306 LKYWA PGLYKNRVEFVDDEHFDTIPN SDSFDSEPTNL 342 

Qy 994 DKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGSATIYG 1053 

III: I : I III I : I I I I I I I I II I I I I I I I : I : I I : I I I 

Db 343 TLALDCLNLRLYEGQITGLLGHNGAGKTTTMSILCGLYAPSSGTAKIYQ 391 

Qy 1054 HDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL 1113 

MM:: : I I I : I II I II II II I I I : : M : I : : : I : : : : I 
Db 392 RDIRTDLRRVRDVLGICPQHNVLFSHLTVSEQLRLFAALKGVPDSELTSQVDEILASVSL 451 

Qy 1114 SNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKYKPGR 1173 

: I : I I I II II II : I : II I : II II : M M II II I I II : I I I : : I I I 

Db 452 TEKANKLASTLSGGMKRRLCIGIAFIGGSRFVILDEPTAGVDVTARKDIWKLLQRNKEGR 511 

Qy 1174 TILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQE 1233 

II I I II II II II I : I I II I I : I II MM 
Db 512 TILLSTHHMDEADVLSDRIAILSQD FEKPDLLDGKRL 548

Qy 1234 PGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLF 1293

: I 

Db 549 IF 550 

Qy 1294 QHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASG 1353 

I I 

Db 551 QH- 552 

Qy 1354 EGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQ 1413 

Db 553 — . — 552 

Qy 1414 EVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVA- 1472

I : I I I I : : : : | : : : | I : : I 

Db 553 FYALLVCRINYTLKSKRTFLFQVI I PLFLLALAE 586 

Qy 1473 MTVALSVPEI-GDLPPLVLSPSQYHNYTQPRGNFIPYANEERREYRLRLSPDASPQ 1527 

: I : : I : : MM I I I : : Mil : : : : 

Db 587 LFVLLQVSTARPDLMVSMPPLPLETSIMGNHS DF— YVNS WDTAENSTAN 634 

Qy 1528 QLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLS 1587 

:: Mil: IM I : : 
Db 635 D I LHAMFS S P GT G P RCAKDVPN D LLDTMRRELMFR N 670

Qy 1588 NFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVRCTCSAQG 1647 

: III I I : I I : I M I Mil I 

Db 671 RYGFGRNKPAPGVDKDSVDNEYQCQNIQ GEFDYTEDIS-NATYNAPIYCGCEDFG 724 

Qy 1648 TGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRL HRYGA1TF 1700 

: : I : I I : I : I I I : : : : I I I I I : 

Db 725 WNCTLEDWKWNETNWLRLNTTDRIFDLTGRNLTQFRLIT — RFAQLANTTAP FFLGGFS L 782 

Qy 1701 GNV LKSI PAS FGTRAP 1716 

|:| : I I II : I 



Db 783 GHVNQRAQSQADIDTSKRGWLETIKDIAQSMRIINLNTTGIEPATPKVLDPFAQNITLNQ 842



Qy 1717 PMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNH 1773 

::: : I I ■ : I : : I I I : I I I : I I : I I : I II II 

Db 843 WNDLLQNLDVRENVKVWFNNKIWPGFPI ASNI LSNALLRQE — DYAI DPEDLGI LTMNH 900 

Qy 1774 PMNKTSASLSLDYLLQGTDWIAIF IIVAMSFVPAS FWFLVAEKSTKAKHLQFVS 1829 

I M II I : I I : I : I : : : : | : | | | | : | | : : : I I I I I 

Db 901 PMNKT-ISQTLDQN7U.KFTQALAVFRITILLLVLSMIPAGFTVYLVEDRICEALHLQLVG 959 

Qy 1830 GCN P 1 1 YWLAN YVWDMLN YLVPAT CCVI I L FVFDL P AYT S PTN FPAVL S L FLL YGWS I T P 1889 

I : I I : : I : : I I : : I I : : I I I II 

Db 960 GLRKVTYWVTSYLYDMVGGIHPRHHC NNAHLP-VLPCLRLYRRRRNI 1005 



Qy 1890 IMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCF 1946 

: I I : | : I I : : f I I 

Db 1006 LRLPPS ILRARNVDDSLRL — CI PKSLFCGG 1034 

Qy 1947 -LI FPNYNLGHGLMEMAYNEYINEYYAKI GQ- FDKMKS PFEWDI VTR GLVAMAVEG 2000 

I II I : : : : : I : I : : : : I I I : : : I : I I 

Db 1035 SLFCPNCN WFLLRRHSLCLDSYHARVAYGSEQMNRP DMI NQ L P L P S L LAFDQMG 1088 

Qy 2001 V VGFLLTIMC QYNFLRRPQR MPVSTKPVEDDVDVASERQRVL 2042 

: : : : : | : : I : I : : I I : I I I I I I I I I 

Db 1089 IHIMCIiFIHVT IATICLI FSQMDEFGFVRKRERNLTDAMMLREPSTCDDEDWKERQRV- 1147

Qy 2043 RGDA DNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTS 2096 

||. II : : | I | I : I I I : I I I I I I I I I I : I I I I I I : 

Db 1148 — DAI PMD S S DNHAL I VRN LAKAYN P ELLAVKGISFAVEPGECFGLLGLNGAGKTT 1201 

Qy 2097 TFKMLT GDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQL 2151 

II I II II : I I : : I I I I I I I I I I I : I : I I : I : 

Db 1202 TFAMLTAKIRPGHGSIEMQNTRINTGS-FSDVRNFQQ-LGYCPQFDALNMKLSTRENLKF 1259 

Qy 2152 YTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLD 2211 

I I : I II : : I II I I : : I I I I : I I I I I : I I : I : I I I I 

Db 1260 YARI RGI VPTQI DS 1 1 DRLLIALHLRP YANTQTS SLSGGNRRKLSVAVALVSQPSLI FLD 1319 

Qy 2212 EPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQ 2271 

I I : I I I I : : : I I I : I I I : I : : I I II I I I I I I I I I I I I I : I I I I I : I I I I I 
Db 1320 EPSAGMDPGSQQFLWKVIERLCKSGKAWLTSHSMEECEALCTRIAIMDRGRIRCLGGKQ 1379 

Qy 2272 HLKNRFGDGYMITVRTKSSQSVKDWRFFNRNFPEAMLKER-HHTKVQYQLKSEHISLAQ 2330 

| | | : : : | I I : I : : : : I : : : I I : : : : I : I : 

Db 1380 HLKSKYGKGSMLTMKMGKDENAKEIAGIMRSKLGDGSRVEAIHCSTIFIHIEQGTASVAR 1439 

Qy 2331 VFSKMEQV 2338 

I : I I 
Db 1440 VLEIVNQV 1447 



RESULT 10 
C88925 

protein F33E11.4 [imported] - Caenorhabditis elegans
C; Species: Caenorhabditis elegans 

C;Date: 10-May-2001 #sequence_revision 10-May-2001 #text_change 10-May-2001 
C;Accession: C88925 



R; anonymous, The C. elegans Sequencing Consortium. 
Science 282, 2012-2018, 1998 

A;Title: Genome sequence of the nematode C. elegans: a platform for 
investigating biology. 

A;Reference number: A75000; MUID:99069613; PMID:9851916
A;Note: see websites genome.wustl.edu/gsc/C_elegans/ and 
www_sanger.ac.uk/Projects/C_elegans/ for a list of authors 

A;Note: published errata appeared in Science 283, 35, 1999; Science 283, 2103, 

1999; and Science 285, 1493, 1999 

A; Accession: C88925 

A; Status: preliminary 

A;Molecule type: DNA 

A;Residues: 1-1317 <STO>

A;Cross-references: GB:chr_V; PIDN:AAC17542.1; PID:g3158495; GSPDB:GN00023;
CESP:F33E11.4

C; Genetics : 

A; Gene: F33E11.4 

A; Map position: 5 

Query Match 13.3%; Score 1688; DB 2; Length 1317; 

Best Local Similarity 28.1%; Pred. No. 3.5e-103; 

Matches 441; Conservative 241; Mismatches 480; Indels 406; Gaps 31; 

Qy 858 AFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILTWYIEAV 917

I I : I I I I : : I I : : I : I : I I : : MINIMI 

Db 49 ALNYGLQLLAAYETQADGLKWGELFTPPSPDNNLTFGHALIALIVDGIIMIILTWYIEAV 108 

Qy 918 HPGMYGLP-RPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQACAMESRRFEE 976

M I : I : I I : I I I I I I I : : : : : 

Db 109 IPGGEGVPQKPWFFVL-PSYWF PNSGSKTVDSSDQFQQIQYAD 150 

Qy 977 TRGMEEEPTHLPLWCVDKLTKVY- KDDKKLALNKLSLNLYENQWS 1022 

: I : I I I I : I M I I | : | | : : | : | : | | 

Db 151 HVKLEKEPTDLIPTINWNLTKTYGTSFFKKLFDCKFGKSGEKRAVSNLNLKMYPGQCTV 210 

Qy 1023 FLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTV 1082 

I I I I I I I I : I I I : H I : I : I I I I : MM : : M : ': I : I M : I I I Ml 
Db 211 LLGHNGAGKSTTFSMLTGVASPSSGSAYVNDFDIRTSLPKIRREMGLCPQYNTLFGFMTV 270 

Qy 1083 EEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGS 1142 

M I I : : : M : I : : : I ' I : MM M M I M M Mil 
Db 271 MEHLEFFAKLKERTWDP — EEAREIUVRLRIDFKADFMAGALSGGQKRKLSLAIALIGGS 328 

Qy 1143 RAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKC 1202 

: : I I I I I M : I I I I I M : I I M I I : I I I : I I I M I M M I : : I I : I M 
Db 329 EVVMLDEPTSGMDPGARHETWTLIQREKERRTILLTTHFMEEADLLGDRIAIMAHGQLEC 388 

Qy 1203 CGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHV 1262

I I M : M I I I M I I I : I I M : : M ' : : : 

Db 389 CGSPMFLKQQYGDGYHLTIVY DTTSTP DVSKTTDIIREYI 428 

Qy 1263 ASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKV 1322 

: I |:|:| I : I : I I : I I : : I I I : I I M I II I I I 

Db 429 PEAHVFS YIGQEATYLL — SATHRPI FPKLFKELEDHQTQCGITSFGVSITTMEEVFLKV 486 



Qy 1323 SEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSA 1382

II : : : : :

Db 487 GHTADERYNYEHGI ENDI S EMI 508

Qy 1383 RGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHG 1442

: I : I I : : I : : : | | : : : 

Db 509 EKDDPILQDLRAQV RVTGFTLQMQHAKA 536 

Qy 1443 LLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPR 1502 

: I I I | : : | : : : : : : | : : | : I 

Db 537 MFYKRAIFFFRKWTQFLPQLVFPVAYLVLMVFTSQVLPSVKEQDPQTIS 585 

Qy 1503 GNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSG 1562 

I : : : : : : III III I I : I I I 

Db 586 — LAP FS DT KKAGH LVS DSGNYVTL LGGSQNLS — 616 

Qy 1563 ESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGP 1622 

Db 617 616

Qy 1623 EMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEY 1682 

I I I I : I : | | | | | : : 

Db 617 SMVQGTVTQLGVT — QTWDITS'-NVEKF 642 

Qy 1683 LLFT S DRFRLHRYGAI T FG -- NVLKS I PAS FGTRAP PMVRKI AVRRAAQVFYNNKGYHSM 1740

: : : : Kill I : I : I I I : 

Db 643 IMDQTNAM GSRTFGLHYALGFVPSMF NFSTVSV 675 

Qy 1741 PT YLN S LNNAI LRANL P KS KGN PAAYGI TVTNH PMN KT S AS L S LD YLLQGT DWIAI FI I 1800 

Db 676 PS : LK 679 

Qy 1801 VAMSFVPAS FWFLVAEKSTKAKHLQFVSGCNPII YWLANYVWDMLNYLVPATCCVIILF 1860

: : : I : I I : I : I : I I : I : I I I : : I I : : I I : : : I I : 

Db 680 I S FAVCVAGYSQFLITERKKKSKHMQLLS GI RPWMFWLTAFIWDAAWFVI RI LC FDAI FY 739 

Qy 1861 VFDL P AYT S P TN F P AVL S L - FL L YGW S I T P I MY PAS FW FE VP S S AYVFL IVINLFIGI T A 1919 

: I : : I I I : I : I I I I I I I : I I I : I I :::::: I : 

Db 740 I FNITAYTHDFGVMLI LTLS FLLYGWTALPFT YWFQFFFESAPKGFMMVTMYHI LTGMI G 799 

Qy 1920 TVATFLLQLFEHDKDLKWNSYLKSCFL — IFPNYNLGHGLMEMAYNEYINEYYAKIGQF 1977 

: : I : : : : I I I : I I I I : I I : I : 

Db 800 SIAVPII SQT S S LDAGYLWS 1 1 FAWLFPT YNI SQI ATVT FQNENVRI ACKKLDCT 854 

Qy 1978 DKMKSPFEWDIVT RGL VAMAVEGWGFLLTIM 2009 

I : I I : | : : : | | : | : :: I 

Db 855 IPM FKAVTACCGTASERLWDNVLFVGNRKGILVWIFLAVQGFIYWIWVFMREN 909 

Qy 2010 CQYNFLRRPQRMPVSTKPVED D VD VAS E RQ RVL RGDADN DMVK I - EN 2055 

I: I I : | : | I II :|: I I :| : I 

Db 910 DQFTKLFALI RCRKADNP IWDI TDT DKVDERDVEDS DVI AEKS WQRLANNNKTALVSNN 969 

Qy 2056 LTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVN 2115 

III I II : I : I I I I I I I I I I I I I I I I : I I I I : I : I : I : I I 

Db 970 LVKWY GN FNAVKGVN FHVN S KDC FGLLGVN GAGKT S T FQMLTGENS I S S GDAYVN 1024 

Qy 2116 GHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLE 2175 

III : : I I I I I I I : I : : I I : : I : I I I II : I : : 

Db 1025 GWS VKNNWREAGANTGYC PQYDAI I KEMSGEET LYMFARI RGI PEKDI PKKVNAVI HAI G 1084 



Qy 2176 LTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKT 2235

: II: I I I I I I I I : I I I I : : I I : I I I I I : I : I I I I i I : I I : : I

Db 1085 IGMYASRQIKTYSGGNKRRLSLGIAIVGLPDVLLLDEPTSGVDPKARRIIWNILNRLRDL 1144

Qy 2236 GRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKD 2295

I I I I I I I I • I I I I I I i I I I i i I • M i i i i • i • i • i ii • -i I •

Db 1145 GTALVLTSHSMDECEALCTELAIMVYGKFRCYGSCQHIKSRYGSGYTLLIRLKNRNDAEK 1204

Qy

Db

Qy 2356 NVFVNFAK 2363

Db 1265 QVFIEFSR 1272



RESULT 11 
S60124 

transport protein homolog C48B4.4 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 13-Jan-1996 #sequence_revision 12-Apr-1996 #text_change 02-Feb-2001 
C;Accession: S60124; S40724; S40725 
R; Kershaw, J. 

submitted to the EMBL Data Library, November 1995 

A; Reference number: S60124 

A;Accession: S60124 

A;Molecule type: DNA 

A; Residues: 1-1767 <KER> 

A;Cross-references: EMBL:Z29117; NID:g439247; PID:g1066912

C; Genetics : 

A;Map position: III 

A;Introns: 47/1; 112/3; 161/2; 220/2; 543/3; 574/3; 903/2; 1056/1; 1115/3; 
1178/3; 1265/2; 1331/3; 1416/3; 1703/3 

C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette
homology 

C;Keywords: ATP; duplication; nucleotide binding; P-loop; transmembrane protein 

F;628-818/Domain: ATP-binding cassette homology <ABC1>

F;645-652/Region: nucleotide-binding motif A (P-loop)

F;764-769/Region: nucleotide-binding motif B

F;1457-1642/Domain: ATP-binding cassette homology <ABC2>

F;1474-1481/Region: nucleotide-binding motif A (P-loop)

F;1586-1591/Region: nucleotide-binding motif B
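
The feature lines above follow a regular `F;start-end/Kind: description` pattern, which makes them straightforward to read programmatically. The following is a minimal sketch of such a reader; the `Feature` type, the `parse_feature` name and the regular expression are illustrative assumptions, not part of any PIR tooling, and the pattern tolerates stray spaces around the delimiters but not inside the numbers.

```python
import re
from typing import NamedTuple


class Feature(NamedTuple):
    start: int
    end: int
    kind: str
    description: str


# Hypothetical pattern for lines such as "F;628-818/Domain: ... <ABC1>".
FEATURE_RE = re.compile(r"^F;\s*(\d+)\s*-\s*(\d+)\s*/\s*(\w+)\s*:\s*(.+)$")


def parse_feature(line: str) -> Feature:
    """Split one feature line into residue range, kind and description."""
    match = FEATURE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a feature line: {line!r}")
    start, end, kind, description = match.groups()
    return Feature(int(start), int(end), kind, description.strip())


feature = parse_feature("F;628-818/Domain: ATP-binding cassette homology <ABC1>")
print(feature.kind, feature.end - feature.start + 1)  # Domain 191
```

Applied to the six lines above, this yields two 'Domain' entries (ABC1 and ABC2), each containing one motif A and one motif B region.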

Query Match 12.0%; Score 1524; DB 2; Length 1767; 

Best Local Similarity 25.4%; Pred. No. 4.9e-92; 

Matches 530; Conservative 332; Mismatches 638; Indels 584; Gaps 75; 
Qy 447 LGLLVHLM TSNPKILY -- APAGSEVDRVILKAN ETFAFVGNVT 487

II i i • i • i i • i • • i i • I i • • i • ii

Db 103 LGPLVYLWKNADHTSSPENIYDNFQVKGTVEDVFLESNFIKPIYKRWCLRSDWVGYTS 162

Qy 488 HYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPALRQDNFS 547

I . • * i i ii i • I • i i • • i iii

Db 163 KDAAAKRTVDDLMKKFAE -- RFQS AKLKLSVKN-ESSEEQLLTVLRND 207



Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607

I :: | : : I : I II : I II

Db 208 LPMLNET FC AI N S YAAGV VF DEVDVTNKKLN 238

Qy 608 VI FQTRKDGSLPPHVHYKI RQNSS FTEKTNEI RRAYWRPGPNTGGRF YFLYG 659

| : | : | : | : : | : I I : I :

Db 239 YRILLGKT-PEETWHLTETSYNPYGPSSGRYSRIPSSPPYWTSA 281

Qy 660 FVW I Q DMME RAI IDT FVGH D WE P G S YVQMFPYPCYTR DDFLFVIEHM 707

I : I : I : : : I : I : : : I I I III:

Db 282 FLTFQHAIESSFLSS VQSGAPDLPITLRGLPEPRYKTSSVSAFIDFFPFI 331

Qy 708 MPLCMVISWVYSVAMTIQHI VAEKEHRLKEVMKTMGLNN AVHWVAWFITGFVQ 760

I : : : I I : I : I : I : III: I I I I : II

Db 332 WAFVT FINVI H I TREI AAENHAVKP YLTAMGLST FMFYAAHVVMAFLKFFVI 383

Qy 761 L S I S VT ALT AI L K Y GQVLMH S HWI I W L FLAVYAVAT I M FC FLVS VL Y SKAKLASA 816

|: || :::: : :|: : : I : ::| |: : II

Db 384 FLCSIIPLTFVMEF VS PAALI VTVLM -- YGLGAVI FGAFVAS FFNNTNS AI KAI LV 437

Qy 817 CGGIIYFLSYVPYMYVAI REEVAHDKITAFEKC-IASLMSTTAFGLGSKYFALYEVAGVG 875

I : : I I : | | : | : | : : I : I I I I : : I

Db 438 AWGAMIGISY KLRPEL -- DQISS C FLYGLN INGAFALAVEAI S DYMRRERE 486

Qy 876 IQ-WHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQK 934

: . : I : I : | : | | : | :: | | :

Db 487 LNLTNMFNDSSLH FS LGWALVMMI VDI L 514

Qy 935 SYWLGSG RT EAWEW S W PWART PRLSVMEEDQACAMESRRFEETRG -- 979

I : I I I I :: I II I : I I : I I : I

Db 515 -- WMSIGALWDHIRTSA-DFS LRTLFDFEAPEDDENQTDGVTAQNTRINEQVRNRV 568

Qy 980 MEEEPTHLPLV VCVDKLTKVYKDDKKLAL 1008

Ml : : | | |:: : |:

Db 569 RRSDMEMNPMASTSLNPPNADSDSLLEGSTEADGARDTARADIIVRNLVKIWSTTGERAV 628

Qy 1009 NKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLG 1068

: I I I I | | | | | | | I : I I I : I : I I : I I I I : I : I I I : : : I

Db 629 DGLSLRAVRGQCSILLGHNGAGKSTTFSSIAGIIRPTNGRITICGYDVGNEPGETRRHIG 688

Qy 1069 MCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGM 1128

I I I I : I I : I : I I I I I I II : .: : : : : I : : : I : : I I : I I I I I

Db 689 MCPQYNPLYDQLTVSEHLKLVYGLKGAREKDFKQDMKRLLSDVKLDFKENEKAVNLSGGM 748

Qy 1129 KRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLL 1188

I II I I : I : I I : : I I I I I II : I I II : : I : : I I I I I I : I I : I I I I : I

Db 749 KRKLCVCMALIGDSEWLLDEPTAGMDPGARQDVQKLVEREKANRTILLTTHYMDEAERL 808

Qy 1189 GDRI AI I SHGKLKCCGS PLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLAS S PPGRAPLSS 1248

I I : I : I I I I I I : : I I : I I I I I : I : I :

Db 809 GDWVFIMSHGKLVASGTNQYLKQKFGTGYLLTW LDHNGDK 849

Qy 1249 C S E LQ VS Q F I RKHVAS C LLVS DT S T ELSYILPSEAAKKGAFERLFQ 1294

|| ::::|. I :: III hi I III

Db 850 RK MAVILTDVCTHYVKEAERGEMHGQQIEIILPE -- ARKKEFVPLFQ 894



Qy 1295 HLE RSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEA 1334

Db 895 ALEAIQDRNYRSNVFDNMPNTLKSQLATLEMRSFGLSLNTLEQVFITIGDK VDKAIA 951

Qy 1335 DVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVY 1394

: 1 1 : 1 1 1 : 1 : 1 1 1 1

Db 952 SRQNSR ISHNSRNASEPSLKPAGYDTQSSTKSA 984

Qy 1395 GD YRP LFDN PQDPDNVS LQEVEAEAL S RVGQGS RKLDGGWLKVRQFHGLLVKRFHCARRN 1454

| : | I : : 1 1 1 : | | | : : | : | : I I I

Db 985 DSYQKLMD SQARGPEKSGVAKM VAQFISIMRKKFLYSRRN 1024

Qy 1455 SKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVL-SPSQYHNYTQPRGNFIPYANEER 1513

1 1 : 1 : 1 : 1 : : : 1 1 1 ' : 1 : : 1

Db 1025 WAQ L FT QVL IPIILLGL VGSLTTLKSNNTDQFRSLT 1060

Qy 1514 REYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFD 1573

III: : 1: II::

Db 1061 PSGIEPSKWWRFENGTI 1078

Qy 1574 SMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPR 1633

1 : 1 1 : :

Db 1079 PEE AANFEK 1087

Qy 1634 LVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLH 1693

: : I : II : : : | I

Db 1088 I LRKS GGF EVLNYNT KNPL 1106

Qy 1694 RYGAI T FGNVLKS I PAS FGTRAP PMVRKI AVRRAAQVFYNNKGYH SMPT YLNS LNNAI LR 1753

I : I I : I | : : : 1 : 1 1 : 1 1 : : : 1 1

Db 1107 PNITKSL 1 GEMP PAT I GMTMN S DNLEAL FNMRY YHVL PT L I SMI N R 1152

Qy 1754 ANLPKS KGNPAAYGITVTNHPMNKTSASLSLDYLL- -QGTDWI AI FI I VAMS FVPAS FV 1811

II : : 1 : : III II 1 1 1 :: 1 : 1 : : 1 :: 1 1

Db 1153 ARLT GT VDAE I S S GVFL YSKSTSNSNLLPSQLIDVLLAPMLILIFAMVTSTFV 1205

Qy 1812 VFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPT 1871

: I I : I : : : I I I : : 1 : 1 1 : : 1 : : : 1 : 1 : 1 : 1 1 1 : 1 1 :

Db 1206 MFLIEERTCQFAHQQFLTGISPITFYSASLIYDGILY S L I C L I FL FMF- LAFHWMYD 1261

Qy 1872 NFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLL-QLFE 1930

: 1 : : 1 1 : 1 I : I 1 1 1 : 1 1 1 1 1 1 : : 1 1 : 1 1 1 : : 1

Db 1262 HLAI VI LFWFL YFFS S VP FI YAVS FLFQS PS KANVLLI I WQWI S GAALLAVFLI FMI FN 1321

Qy 1931 HDKDLK -- WNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDI 1988

1 : 1 1 : 1 1 : : : 1 : 1 1 : : Mil : IT

Db 1322 IDEWLKSILVNIFM FLLPSYAFGSAIIT INTY GMILPSEELMNWDH 1367

Qy 1989 VTRGLVAMAVEGWGFLLT IMCQYN FLRR -- PQRMPV STKPVEDDV DVAS 2036

: 1 | | | | : : | : | : | 1 I 1 : : 1 : 1 : 1 :

Db 1368 CGKNAWLMGTFGVCSFALFVLLQFKFVRRFLSQVWTVRRSSHNNVQPMMGDLPVCESVSE 1427

Qy 2037 ERQRVLRGDADNDMVKI ENLTKVYKS RKI GRI LAVDRLCLGVRPGEC FGLLGVNGAGKT S 2096

|| : I I I : : 1 : 1 :: 1 1 1 : II 1 1 : 1 1 1 1 . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 :

Db 1428 ERERVHRVNSQNSALVIKDLTKTF GRFTAVNELCLAVDQKECFGLLGVNGAGKTT 1482

Qy 2097 TFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLR 2156

Db 1483 T FN I LT GQ S FAS S GEAMI G GRDV- T EL I SIGYCPQFDALMLDLTGRESLEILAQMH 1537

Qy 2157 GI-SWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTT 2215

1 : : | : | : : I I : : MM MM 1 1 1 - 1 MM 1 M 1 M

Db 1538 GFENYKAKAELI LECVGMI AHADKLWFYSGGQKRKI SVGVALLAPTQMI I LDEPTA 1594

Qy 2216 GMDPKARRFLWNLILDLIKTGRS-WLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLK 2274

\: M M M M M 1 : 1 : M M M h M M M M M : : II M 1 1 1 1

Db 1595 GIDPKARREVWELLLWCREHSNSALMLTSHSMDECEALCSRIAVLNRGSLIAIGSSQELK 1654

Qy 2275 NRFGDGYMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTK VQYQL-KSEHISLAQ 2330

: M : 1 M : 1 1 : 1 : M 1 1 : : M : 1 : :

Db 1655 SLYGNNYTMTLSLYEPNQRDMWQLVQTRLPNSVLKTTSTNKTLNLKWQIPKEKEDCWSA 1714

Qy 2331 VFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQET 2374

1 : : : : I I : M : : M : M : M 1 1 h 1

Db 1715 KFEMVQALAKDLGVKDFILAQSSLEETFLRLAGLDEDQLDTHST 1758




RESULT 12
F88559

protein C48B4.4b [imported] - Caenorhabditis elegans





C; Species: Caenorhabditis elegans 

C;Date: 10-May-2001 #sequence_revision 10-May-2001 #text_change 17-May-2002 
C;Accession: F88559

R; anonymous , The C. elegans Sequencing Consortium. 
Science 282, 2012-2018, 1998 

A;Title: Genome sequence of the nematode C. elegans: a platform for 
investigating biology. 

A;Reference number: A75000; MUID: 99069613; PMID:9851916 
A;Note: see websites genome.wustl.edu/gsc/C_elegans/ and 
www_sanger.ac.uk/Projects/C_elegans/ for a list of authors 

A;Note: published errata appeared in Science 283, 35, 1999; Science 283, 2103, 

1999; and Science 285, 1493, 1999 

A;Accession: F88559 

A; Status: preliminary 

A; Molecule type: DNA 

A; Residues: 1-1758 <STO> 

A;Cross-references: GB:chr_III; PIDN:CAA82384.1; PID:g3875025; GSPDB:GN00021;
CESP:C48B4.4b

C; Genetics : 

A;Gene: C48B4.4b 

A;Map position: 3 

C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette
homology 

Query Match 12.0%; Score 1522.5; DB 2; Length 1758; 

Best Local Similarity 25.5%; Pred. No. 6.1e-92; 

Matches 528; Conservative 331; Mismatches 644; Indels 569; Gaps 74; 

Qy 447 LGLLVHLM TSNPKILY -- APAGSEVDRVILKAN ETFAFVGNVT 487

I I M M : M M : M I : I I : M : M : 

Db 103 LGPLVYLWKNADHTSSPENIYDNFQVKGTVEDVFLESNFIKPIYKRWCLRSDWVGYTS 162 

Qy 488 HYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPALRQDNFS 547

I : : : | | | | I M M : I : M M I 
Db 163 KDAAAKRTVDDLMKKFAE — RFQS AKLKLSVKN-ESSEEQLLTVLRND 207 



Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607

I :: I : : I : I II : I II

Db 208 LPMLNET FCAI N S YAAGV VF DEVDVTNKKLN 238

Qy 608 VI FQTRKDGSLP PHVHYKI RQNS S FTEKTNEI RRAYWRPGPNTGGRF YFLYG 659

1:1 : | : | : : | : I I : I :

Db 239 YRILLGKT-PEETWHLTETSYNPYGPSSGRYSRIPSSPPYWTSA 281

Qy 660 FVW I Q DMME RAI I DT FVGH D WE P G S YVQMFPYPCYTR DDFLFVIEHM 707

I : I : | : : : I : I : : : I I I III:

Db 282 FLTFQHAIESSFLSS VQSGAPDLPITLRGLPEPRYKTSSVSAFIDFFPFI 331

Qy 708 MPLCMVISWVYSVAMTIQHI VAEKEHRLKEVMKTMGLNN AVHWVAWFI T GFVQ 760

I : : : I I : I : I : I : III: III I : II

Db 332 WAFVTFINVIHITREIAAENHAVKPYLTAMGLSTFMFYAAHWMAFLKFFVI 383

Qy 761 LS I SVT ALTAI LKYGQVLMHSHWT IWLFLAVYAVAT IMFCFLVS VLY SKAKLASA 816

I : II:::: : " : I : : : I : : : I I : : I I

Db 384 FLCSIIPLTFVMEF VS PAALI VTVLM -- YGLGAVI FGAFVAS FFNNTNS AI KAI LV 437

Qy 817 CGGIIYFLSYVPYMYVAIREEVAHDKITAFEKC-IASLMSTTAFGLGSKYFALYEVAGVG 875

I : : I I : | | : | : I :: I : I I I I : : I

Db 438 AWGAMIGISY -- KLRPEL -- DQ I S S - -- C FL YGLN I N GAFALAVEAI S D YMRRERE 486

Qy 876 IQ-WHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQK 934

: : | : | : I : I I : I : : I I :

Db 487 LNLTNMFNDSSLH FSLGWALVMMIVDIL 514

Qy 935 SYWLGSG RT EAWEW S W P WART PRLSVMEEDQACAMESRRFEETRGME 981

I : I I I I : : I II I : I I : I I : I

Db 515 - - WMS I GALWDH I RT S A- D FS LRTLFDFEAPEDDENQTDGVTAQNTRINEQMNPMA 568

Qy 982 EEPTHLPLV VCVDKLTKVYKDDKKLALNKLSLNLYENQ 1019

: I : I I I : : : I : : I I I I

Db 569 STSLNPPNADSDSLLEGSTEADGARDTARADIIVRNLVKIWSTTGERAVDGLSLRAVRGQ 628

Qy 1020 WSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDR 1079

I I I I I I I I : I I I : I : I I : I I I I : I : I I I : : : I I I I I : I | : | :

Db 629 CSILLGHNGAGKSTTFSSIAGIIRPTNGRITICGYDVGNEPGETRRHIGMCPQYNPLYDQ 688

Qy 1080 LTVEEHLWFYSRLKSMAQEEI RREMDKMI EDLELSNKRHSLVQTLSGGMKRKLSVAI AFV 1139

I I I I I I II : : : : : : | : : : | : : | I : I I I I I I I I I I : I :

Db 689 LTVSEHLKLVYGLKGAREKDFKQDMKRLLSDVKLDFKENEKAVNLSGGMKRKLCVCMALI 748

Qy 1140 GGSRAI I LDEPTAGVDP YARRAIWDLI LKYKPGRTI LLSTHHMDEADLLGDRIAI I SHGK 1199

II : : I I I I I I I : I I I I : : I : : I I I I I I : I I : I I I I : I I I : I : I I I I

Db 749 GDSEWLLDEPTAGMDPGARQDVQKLVEREKANRTILLTTHYMDEAERLGDWVFIMSHGK 808

Qy 1200 LKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIR 1259

I I : : I I - : I I I I I : I : I : I

Db 809 LVASGTNQYLKQKFGTGYLLTW LDHNGDK R 839

Qy 1260 KHVAS CLLVS DT ST ELSYILPSEAAKKGAFERLFQHLE 1297

I : : : : I I : : I II hi I I I I I I

Db 840 K MAVI LTDVCTHYVKEAERGEMHGQQI EI I LPE -- ARKKEFVPLFQALEAIQDRNYR 894

Qy 1298 RSLDALHLS S FGLMDTTLEEVFLKVS EEDQS LENSEADVKES RKDVLP 1345

I I : I I I I I I I : I I : : : : : : : I : I I

Db 895 SNVFDNMPNTLKSQLATLEMRSFGLSLNTLEQVFITIGDK VDKAIASRQNSR 946

Qy 1346 GAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLFDNPQ 1405

Db 947 ISHNSRNASEPSLKPAGYDTQSSTKSA DSYQKLMD 981

Qy 1406 DPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLP 1465

::| I I: III :: |:| :||| ll:|:|:|

Db 982 SQARGPEKSGVAKM VAQFI S IMRKKFLYSRRNWAQLFTQVLI P 1024

Qy 1466 AFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEERREYRLRLSPDAS 1525

: : :| I I : I :

Db 1025 IILLGL VGSLTTL KSN N 1041

Qy 1526 PQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLP 1585

I Ml: : I : II --

Db 1042 TDQFSVRSLTPSGIEPSKWWRFENGTI 1069

Qy 1586 LSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVRCTCSA 1645

I : I | : : : : | :

Db 1070 PEE AANFEKILRKS 1083

Qy 1646 QGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLHRYGAITFGNVLK 1705

II :: :| | |: |

Db 1084 -GGF EVLNYNTKNPL -- PNITK 1102

Qy 1706 S I PAS FGT RAP PMVRK I AVRRAAQVFYNN KGYH SMPT YLN S LNNAI LRANL P KS KGN PAA 1765

I : I I : : : I : I I : I I : : : I III: :

Db 1103 SL 1 GEMP PAT I GMTMN S DNLEAL FNMRYYHVLPTLI SMI N RARLTGTVDAEIS 1155

Qy 1766 YGITVTNHPMNKTSASLSLDYLL-- QGTDWIAIFIIVAMSFVPASFWFLVAEKSTKAK 1823

I: : III II I I |:: I :|: : I : : I I : I I : I : : :

Db 1156 SGVFL YSKSTSNSNLLPSQLIDVLLAPMLILIFAMVTSTFVMFLIEERTCQFA 1208

Qy 1824 HLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLY 1883

I I I : : I : I I : : I : : : I : I : I : I I I : I I : : I : : I I

Db 1209 HQQFLTGISPITFYSASLIYDGILY--- SLICLIFLFMF-LAFHWMYDHLAIVILFWFLY 1264

Qy 1884 GWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLL-QLFEHDKDLK -- WNS 1940

: I I : I llhllllll: : I I : I If: : I I : I I : I I

Db 1265 FFS SVPFI YAVS FLFQSPSKANVLLI IWQWI SGAALLAVFLI FMI FNI DEWLKSI LVNI 1324

Qy 1941 YLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEG 2000

: : : | : | | : : I I I I : II : I I

Db 1325 FM-- -FLLPSYAFGSAIIT INTY GMILPSEELMNWDHCGKNAWLMGTFG 1370

Qy 2001 WGFLLT IMCQYN FLRR -- PQRMPV- STKPVEDDV DVAS E RQ RVLRG D ADN 2048

I I I : : I : I : I I I I : : I : I : I : I I : I I I : : I

Db 1371 VCSFALFVLLQFKFVRRFLSQVWTVRRSSHNNVQPMMGDLPVCESVSEERERVHRVNSQN 1430

Qy 2049 DMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTT 2108

: I :: I I I : II I I : I I I I I I I I I I I I I I I I I I : I I : I I I :

Db 1431 SAL VI KDLT KT F GRFTAVNELCLAVDQKECFGLLGVNGAGKTTTFNILTGQSFAS 1485

Qy 2109 GGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGI - SWKDEARW 2167

I I I : I I II: I : I I I I I I I I : I I I I I : : : : I : : I : I ::
Db 1486 SGEAMIGGRDV-TELI SIGYCPQFDALMLDLTGRESLEILAQMHGFENYKAKAELI 1540 

Qy 2168 KWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWN 2227 

II:: : I I I I I I I I I I : I : I I : I I I I I I I : I I I I II - : I 

Db 1541 LECVGMIAHADKLVRFYSGGQKRKI SVGVALLAPTQMI I LDEPTAGI DPKARREVWE 1597 

Qy 2228 LILDLIKTGRS-VVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVR 2286 

I : I : I : : I I I I I I : I I I I I I : I : I : : II : I I I I I : : I : I : I : 

Db 1598 LLLWCREHSNSALMLTSHSMDECEALCSRIAVLNRGSLIAIGSSQELKSLYGNNYTMTLS 1657 

Qy 2287 TKSSQSVKDWRFFNRNFPEAMLKERHHTK VQ YQ L-KSEHIS LAQVF S KMEQVS GVL 2342 

II: I : : I I I :::|: | : : I :: :: I 

Db 1658 LYEPNQRD^WVQLVQTRLPNSVLKTTSTNKTLNLKWQIPKEKEDCWSAKFEMVQALAKDL 1717 

Qy 2343 GIEDYSVSQTTLDNVFVNFAKKQSDNLEQQET 2374 

I :: I : :: I :: I : I : I I I : I 

Db 1718 GVKDFI LAQS S LEET FLRLAGLDEDQLDTHST 1749 



RESULT 13 
T42749 

ATP-binding cassette transport protein homolog - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans 

C;Date: 11-Jan-2000 #sequence_revision 11-Jan-2000 #text_change 21-Jul-2000
C;Accession: T42749 
R;Wu, Y.C.; Horvitz, H.R. 
Cell 93, 951-960, 1998 

A; Title: The C. elegans cell corpse engulfment gene ced-7 encodes a protein 
similar to ABC transporters. 

A;Reference number: Z22259; MUID:98297348; PMID:9635425
A;Accession: T42749 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A; Residues: 1-1704 <WUY> 

A;Cross-references: EMBL:AF049142; NID:g3172340; PIDN:AAC24116.1; PID:g3172341
C; Genetics: 
A; Note: ced-7 

C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette
homology 

Query Match 12.0%; Score 1515; DB 2; Length 1704; 

Best Local Similarity 25.4%; Pred. No. 1.8e-91; 

Matches 529; Conservative 332; Mismatches 642; Indels 582; Gaps 75; 

Qy 447 LGLLVHLM -- TSNPKILY -- APAGSEVDRVILKAN ETFAFVGNVT 487

1111:1: I I : I : : I I : I I : : I : I I : 

Db 36 LGPLVYLWKNADHTS S PEN I YDNFQVKGTVEDVFLESN FI KP I YKRWCLRS DVWGYT S 95 

Qy 488 HYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPALRQDNFS 547 

I : : : I I | | I : I : I : I : : I III 
Db 96 KDAAAKRTVDDLMKKFAE -- RFQS AKLKLSVKN-ESSEEQLLTVLRND 140

Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607 

I : : I : : I : I . I I : I I I 
Db 141 L PMLN ET FCAI N S YAAG V- — VF DEVDVTNKKLN 171 



Qy 608 VI FQTRKDGSLPPHVHYKI RQNS S FTEKTNEI RRAYWRPGPNTGGRF YFLYG 659 

1:1 : I : I : : | : | | : I : 

Db 172 YRILLGKT-PEETWHLTETSYNPYGPSSGRYSRIPSSPPYWTSA 214 

Qy 660 FVWIQDMMERAIIDTFVGHDWEPGS YVQMFPYPCYTR DDFLFVIEHM 707 

I : I : I : : : I : I : : : I I I III: 

Db 215 FLTFQHAIESSFLSS VQSGAPDLPITLRGLPEPRYKTSSVSAFIDFFPFI 264 

Qy 708 MPLCMVISWVYSVAMTIQHI VAEKEHRLKEVMKTMGLNN AVHWVAWFITGFVQ 760

I : : : I I : I : I : I : III: III I : II 

Db 265 WAFVTFINVIHITREIAAENHAVKPYLTAMGLSTFMFYAAHWMAFLKFFVI 316 

Qy 761 LSI SVT ALTAI LKYGQVLMHSHWI IWLFLAVYAVATIMFCFLVSVLY SKAKLASA 816 

|: II :::: : :|: : : I : ::| |: : I I 

Db 317 FLCSIIPLTFVMEF VS PAALI VTVLM — YGLGAVI FGAFVAS FFNNTNSAI KAI LV 370 

Qy 817 CGGII YFLSYVPYMYVAI REEVAHDKITAFEKC-IASLMSTTAFGLGSKYFALYEVAGVG 875 

I : : I I : | | : | : | : : | : | I I I : : I 

Db 371 AWGAMIGISY KLRPEL — DQISS CFLYGLNINGAFALAVEAI SDYMRRERE 419 

Qy 876 IQ-WHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQK 934

: : | : | : I : I | : | : : | | : 
Db 420 LNLTNMFNDSSLH FSLGWALVMMIVDIL 447

Qy 935 SYWLGSG RTEAWEWSWPWART PRLSVMEEDQACAMESRRFEETRG — 979 

I : I II I : : I II I : I I : I I : I 

Db 448 — WMS I GAL WD H I RT S A- D FS LRTLFDFEAPEDDENQTDGVTAQNTRINEQVRNRV 501 

Qy 980 MEEEPTHLPLV VCVDKLTKVYKDDKKL 1006 

II:: : | | | : : : 

Db 502 RRSDMEIQMNPMASTSLNPPNADSDSLLEGSTEADGTVRDTT^ADIIVRNLVKIWSTTGER 561 

Qy 1007 ALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKN 1066 

I:: Ml I I I I I I I I I : I I I : I : Ihl II |:|: I I |:: 

Db 562 AVDGLSLRAVRGQCSILLGHNGAGKSTTFSSIAGIIRPTNGRITICGYDVGNEPGETRRH 621 

Qy 1067 LGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSG 1126 

: I I I I I : I I : I : I I I I I I II :::::: | ::: | :: | | : I I I 

Db 622 IGMCPQYNPLYDQLTVSEHLKLVYGLKGAREKDFKQDMKRLLSDVKLDFKENEKAVNLSG 681 

Qy 1127 GMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEAD 1186 

1111111:1:11 :: I I I I I I I : I I I I : : I ■: : I I I I I I : I I : I I I I : 
Db 682 GMKRKLCVCMALIGDSEVVLLDEPTAGMDPGARQDVQKLVEREKANRTILLTTHYMDEAE 741 

Qy 1187 LLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPL 1246 

111:1:11111 I : : I I : I I I I I : I : I : 
Db 742 RLGDWVFIMSHGKLVASGTNQYLKQKFGTGYLLTW LDHNGDK 784 

Qy 1247 S S CS ELQVS QFI RKHVAS CLLVS DT ST ELSYILPSEAAKKGAFERL 1292 

II : : : : I I : : I I I hill 

Db 785 RK MAVILTDVCTHYVKEAERGEMHGQQI EI I LPE — ARKKEFVPL 827 

Qy 1293 FQHLE RSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENS 1332 

II I I I I : I I II Mhll::::::: 

Db 828 FQALEAIQDRNYRSNVFDNMPNTLKSQLATLEMRSFGLSLNTLEQVFITIGDK VDKA 884 



Qy 1333 EADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTD 1392



I : I I : I II: I : II II 

Db 885 IASRQNSR ISHNSRNASEPSLKPAGYDTQSSTKSA 919 

Qy 1393 VYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCAR 1452 

I : I I : : I I I : I I I : : I : I : I 

Db 920 — DSYQKLMD : SQARGPEKSGVAKM VAQFISIMRKKFLYSR 957 

Qy 1453 RNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEE 1512

II I I : | : I : I : : Mil : I 

Db 958 RNWAQLFTQVLI PI I LLGL VGSLTTL KSN 986 

Qy 1513 RREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFF 1572 

: I III: : I : I I :: 
Db 987 NTDQFSVRSLTPSGIEPSKWWRFENGTI 1015 

Qy 1573 DSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLP 1632 

I : I I : 

Db 1016 PEE AANFE 1023 

Qy 1633 RLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRL 1692 

: : : I : II : : : I I 
Db 1024 KILRKS GGF EVLNYNTKNPL 1043 

Qy 1693 HRYGAI T FGNVLKS I PAS FGTRAP PMVRKI AVRRAAQVF YNN KGYH SMPT YLN S LNNAI L 1752 

, I : I I : I I : : : I : I I : I I : : : I 
Db 1044 PNITKSL 1 GEM P PAT I GMTMN S DN L EAL FNMR Y YH VL PTL I SMI N 1088 

Qy 1753 RANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLL — QGTDWIAI FIIVAMSFVPASF 1810 

III : : | : : III | | | I I : : I : | : : I : : I 

Db 1089 RARLTGTVDAEISSGVFL YSKSTSNSNLLPSQLIDVLLAPMLILIFAMVTSTF 1141 

Qy 1811 WFLVAEKSTKAKHLQFVSGCNP 1 1 YWLAN YVWDMLN YLVPATCCVI I LFVFDLPAYTS P 1870 

I : I I : I : : : I I I : : I : I I : : I : : : I : I : I : I I I : I I : 
Db 1142 VMFLIEERTCQFAHQQFLTGISPITFYSASLIYDGILY S L I C L I FL FMF- LAFHWM Y 1197 

Qy 1871 TNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLL-QLF 1929 

: I : : I I : I I : I I I I : I I I I I I : : I I : I I I : : I 
Db 1198 DHLAI VI LFWFLYFFS S VP FI YAVS FLFQS P S KANVLLI IWQWI S GAALLAVFLI FMI F 1257 

Qy 1930 EHDKDLK— WNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWD 1987 

I : I I : I I : : : I : I I : : III I : II 

Db 1258 NIDEWLKSILVNIFM FLLPSYAFGSAIIT INTY GMILPSEELMNWD 1303 

Qy 1988 IVTRGLVAMAVEGWGFLLTIMCQYNFLRR— PQRMPV STKPVEDDV DVA 2035 

: I I I I I :: I : I : I I I I : : I : I : I : 

Db 1304 HCGKNAWLMGT FGVCS FAL FVLLQ FK FVRRFLS QVWTVRRS S HNN VQ PMMGDL P VCE S VS 1363 

Qy 2036 SERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKT 2095 

II : I I I : : I : | : : | | | : | | I I : I I I I I I I I I I I I I I I I I I 

Db 1364 EERERVHRVNSQNSALVIKDLTKTF GRFTAVNELCLAVDQKECFGLLGVNGAGKT 1418 

Qy 2096 STFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRL 2155 

: II : I I I : | | | : | | | | : I : II I I I I I I : I I I I I : : : : 

Db 1419 TTFNILTGQSFASSGEAMIGGRDV-TELI SIGYCPQFDALMLDLTGRESLEILAQM 1473 

Qy 2156 RGI-SWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPT 2214 

I ::| :| :: II : : :IM MM |||:| :||: I Mill 



Db 1474 HGFENYKAKAELI LECVGMIAHADKLVRFYS GGQKRKI S VGVALLAPTQMI I LDEPT 1530



Qy 2215 TGMDPKARRFLWNLILDLIKTGRS-WLTSHSMEECEALCTRLAI^4VNGRLRCLGSIQHL 2273 

I : I I I I I I : I I : I : I : : I I I I I I : I I I I I I : I : I : : II Ml I I 
Db 1531 AGIDPKARREVWELLLWCREHSNSALMLTSHSMDECEALCSRIAVLNRGSLIAIGSSQEL 1590 

Qy 2274 KNRFGDGYMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTK VQYQL-KSEHISLA 2329 

I : : I : I : I : I I : I : : I I I : : : | : I : : 

Db 1591 KSLYGNNYTMTLSLYEPNQRDMVVQLVQTRLPNSVLKTTSTNKTLNLKWQIPKEKEDCWS 1650 

Qy 2330 QVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQET 2374 

I : : : : | | : : | : : : | : : | : | : I | | : I 

Db 1651 AKFEMVQALAKDLGVKDFILAQSSLEETFLRLAGLDEDQLDTHST 1695 



RESULT 14 
T00826 

hypothetical protein T32G6.22 - Arabidopsis thaliana (fragment)
C;Species: Arabidopsis thaliana (mouse-ear cress)
C;Date: 12-Feb-1999 #sequence_revision 12-Feb-1999 #text_change 23-Mar-2001
C;Accession: T00826
R;Rounsley, S.D.; Lin, X.; Ketchum, K.A.; Crosby, M.L.; Brandon, R.C.; Sykes,
S.M.; Kaul, S.; Mason, T.M.; Kerlavage, A.R.; Adams, M.D.; Somerville, C.R.;
Venter, J.C.
submitted to the EMBL Data Library, November 1997
A;Description: Arabidopsis thaliana chromosome II BAC T32G6 genomic sequence.
A;Reference number: Z14163
A;Accession: T00826
A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1246 <ROU>
A;Cross-references: EMBL:AC002510; NID:g2618683; PID:g2618705
A;Experimental source: cultivar Columbia
C;Genetics:
A;Map position: 2
A;Introns: 33/3; 95/2; 113/3; 137/3; 168/3; 361/3; 421/2; 432/3; 493/3; 521/3;
535/3; 568/1; 630/3; 665/3; 726/3; 771/2; 818/3; 864/1; 890/3; 926/3; 977/1;
1022/3; 1053/3; 1155/1; 1199/3; 1220/2
A;Note: T32G6.22

Query Match 11.4%; Score 1448.5; DB 2; Length 1246; 

Best Local Similarity 27.9%; Pred. No. 2.7e-87; 

Matches 415; Conservative 200; Mismatches 388; Indels 485; Gaps 42; 

Qy 1113 LSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKYKPG 1172 

I I : I : : I I : I I I I I I I I I I : II : I I : I I I I I IT: I : I I I : I I I I I I I 
Db 3 LSDKINTLVT^ALSGGMKRKLSLGIALIGNSKVIILDEPTSGMDPYSMRLTWQLIKKIKKG 62 

Qy 1173 RTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQ 1232 

I I I I : I I MM: I I I I I I : : : I I I I I I I : I I I II II I I I I I 
Db 63 RIILLTTHSMDEAEELGDRIGIMANGSLKCCGSSIFLKHHYGVGYTLTLVK 113 



Qy 1233 EPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERL 1292 

: I I : : : : I : I I I : I : I : I I I I I : 
Db 114 TSPTVSVA AHIVHRHIPSATCVSEVGNEISFKLP — LASLPCFENM 157 

Qy 1293 FQHLERSL DALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGA 1347 



I : : I : I : I : I : I I I I I I I I : I : I : : : : : I : 

Db 158 FREIESCMKNSDSDYPGIQSYGISVTTLEEVFLRVA GCNLDIEDKQEDIFVSP 210 

Qy 1348 EGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPL : 1400 

: : I | : : | | : | | | | : I : 

Db 211 DTKSS LVC — I G SNQKS SMQ P KLLAS CN DGAGVI I T S VAKAFRL I VAAVWT L 260 

Qy 1401 --FDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKAL 1458 

I : I : I : I I I I I : I I I I : I ' : 

Db 261 IGF ISIQCCGCSIISR SMFW RHCKALFI KRARSACRDRKTV 301 

Qy 1459 FSQILLPAFFV CVAMTVALSVPEI GDLP PLVLS 1491 

I : : I I I : : : I I I : I : I I : 

Db 302 AFQFIIPAVFLLFGLLFLQLKPHPDQKSITLTTAYFNPLLSGKGGGGPIPFDLSVPIAKE 361 

Qy 1492 PSQY— HNYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPA 1549 

: I I : I I I " : : : : | : | I 

Db 362 VAQYIEGGWIQPLRN TSYKFPNPKEALADAIDAA 395 

Qy 1550 NGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDL 1609 

: I I I I I I I I I : I I : I : I I : I I III 

Db 396 GPTLGPTL-LSMSE— FLMSSFDQS — YQSSREGL SSHDSCNHPDGSL 438 

Qy 1610 QAWNVS L P P TAG P EMWT SAP S L P RLVRE P VRCT C S AQ GT G F S C P S S VGGH P P QMRWT GD 1669 

I : : 

Db 439 GYT 441 

Qy 1670 ILTDITGHNVSEYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQ 1729 

Db 442 441 

Qy 1730 VFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQ 17 8 9 

I : I | : | | : | : : | | | | : I I I I I I : I : I 

Db 442 VLHNGTCQHAGPIYINVMHAAILRL ATGN-KNMTIQTRNHPLPPTKTQ RIQ 491 

Qy 1790 GTDV VI AI FI I VAMSFVPAS FVVFLVAEKSTI<7VKHLQFVSGCNPI I YWLANYVWDML 1846 

I : I I : : I I I : I I I I I : I I : I I I I I : I I : : I I I : I I I I : 

Db 492 RHDLDAFSAAIIVNIAFSFIPASFAVPIVKEREVKAKHQQLISGVSVLSYWLSTYVWDFI 551 

Qy 1847 NYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFE 1899 

: : I I : I : I : : I I : I : : | | I : I I : I : I 

Db 552 S FLFPSTFAI I LFYAFGLEQFI GI GRFLPTVLMLLEYGLAI AS STYCLTFFFTEHSMAQA 611 

Qy 1900 VPSSAYVF LIVINLFIGITATVATFLLQLFEHDKDLKWNSYLK 1943 

: I I : I I : : : : : | | : I : I : : I I I I I I 

Db 612 TS S YS VLLP I S LFVFS FS SNVI LMVHFFS GLI LMVI S FVMGLI PAT AS ANSYLKELI 668 . 

Qy 1944 SCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMK-SPFEWDIVTRGLVA 1995 

: I : I : II : I : I II III:: : 

Db 669 LFRYALQNFFRLSPGFCFSDGLASLA LLRQGMKDKS SHGVFEWNVTGAS I C Y 720 

Qy 1996 MAVEGWGFLLTIMCQYNFL RRPQRMPVSTKP 2027 

: : I : : I : I : I I I I : I 

Db 721 LGLEVRLEY CRYSMLLLSFFHGIDTKLSLIYTIGASRLTELI YDRVYSTSFSTEP 775 



Qy 2028 VEDDVDVASERQRVLRGDADNDMVKIENLTKVYKSRK-IGRILAVDRL 2074
: I II : I I I I I I : I : I I I : : : I I I I I I I : I I I 



Db 776 LLKDSTGAISTDMEDDIDVQEERDRVISGLSDNTMLYLQNLRKVYPGDKHHGPKVAVQSL 835 



Qy 2075 CLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCP 2134 

I : I I I I I I I I I I I I I : I I I : I : I : I I I I : I :: ::| :IMI 

Db 836 TFSVQAGECFGFLGTNGAGKTTTLSMLSGEETPTSGTAFIFGKDIVASPKAI RQHIGYCP 895 

Qy 2135 QCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRK 2194 

I I I I I : I I : I I I : I I I : : I : II I : : I I : : I I : I I I I I I I I 

Db 896 QFDALFEYLTVKEHLELYARIKGWDHRIDNWTEKLVEFDLLKHSHKPSFTLSGGNKRK 955 

Qy 2195 LSTAI ALI GYPAFI FLDEPTTGMDPKARRFLWNLI LDL- 1 KTGR- S WLTSHSMEECEAL 2252 

11111:111 : I I I I : I I I I I I : I I : I : : I I : : I : : I : I I : I I I I : I I * 
Db 956 LS VAI AMI GDP P I VI LDEP STGMDP VAKRFMWDVI S RLST RSGKTAVI LTTH SMNEAQAL 1015 

Qy 2253 CTRLAIMVNGRLRCLGSIQHLKNRFGD : GYM 2282 

I I I : I I I I I I I I : I I I I I I I : I : - 

Db 1016 CTRIGIMVGGRLRCIGSPQHLKTRYGNHLELEVPFYNGVKPNEVSNVELENFCQIIQQWL 1075 

Qy 2283 ITVRTK SSQSVKDWRFFNR ■ 2302 

||: | : | : : : | 

Db 1076 FNVPTQPRSLLGDLEVCIGVSDSITPDTASASEISLSPEMVQRIAKFLGNEQRVSTLVPP 1135 

Qy 2303 NFPEAMLKERHHTK 2316 

ill I I : 

Db 1136 LPEEDVRFDDQLSEQLFRDGGIPLPIFAEWWLTKEKFSALDSFIQSSFPGATFKSCNGLS 1195 

Qy 2317 VQYQLK — S EHI S LAQVFS KMEQVS GVLGI ED YS VSQTTLDNVFVN FA 2362 

: : I II : I I I I : I : I I I : I I : I I : I I : : I : I I 

Db 1196 IKYQLPFGEGGLSLADAFGHLERNRNRLGIAEYSISQSTLETIFNHFA 1243 
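
The "Best Local Similarity" figure in the summary lines of this record appears to be the share of aligned columns that are exact matches. The sketch below checks that reading against the RESULT 14 counts (Matches 415; Conservative 200; Mismatches 388; Indels 485). The function name is hypothetical, and the formula is an assumption inferred from the printed numbers, not taken from GenCore documentation.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of aligned columns that are identical residues.

    Assumption: every aligned column falls into exactly one of the four
    printed categories, so their sum is the alignment length.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts copied from the RESULT 14 summary lines.
print(round(best_local_similarity(415, 200, 388, 485), 1))  # 27.9
```

The result agrees with the printed 27.9%, which supports the assumed formula without proving it.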



RESULT 15 
T27121 

hypothetical protein Y53C10A.9 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 04-Mar-2000
C;Accession: T27121
R;White, S.
submitted to the EMBL Data Library, November 1998
A;Reference number: Z20314
A;Accession: T27121
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1564 <WIL>
A;Cross-references: EMBL:AL033536; PIDN:CAA22142.1; CESP:Y53C10A.9
A;Experimental source: clone Y53C10A
C;Genetics:
A;Gene: CESP:Y53C10A.9
A;Introns: 43/3; 92/2; 148/2; 226/3; 354/1; 712/3; 817/1; 875/1; 916/3; 984/3;
1069/2; 1133/3; 1179/1; 1224/3; 1253/3; 1317/3; 1339/3; 1375/2; 1511/3
C;Superfamily: unassigned ATP-binding cassette proteins; ATP-binding cassette
homology

Query Match 9.5%; Score 1202.5; DB 2; Length 1564; 

Best Local Similarity 24.4%; Pred. No. 9.5e-71; 

Matches 444; Conservative 280; Mismatches 544; Indels 549; Gaps 61; 



Qy 685 SYVQMFPYPCYTRDDF ~ — LFVI EHMMP LCMVI S WVY SVAMT I QH I VAEKEH 733 

i : : ] I : : I I I : : I I I I I I : : I If 

Db 175 S 1 1 ENFDI KLYS ITNFEESGFGQNFGLLFAVCMLMP VISVA RALWEKS- 223 

Qy 734 R L K E VMKTMG LN N AVHWVAW F I T G FVQ LSI S VT AL 7 68 

: I : I : I I : : : I : I : : : : I I 

Db 224 SVKPYLTTIGLPLWMFYLEHFLFGVI KNTFLITLLSTLYIFSMDNCPTYVLAGIFMYTCH 283 

Qy 769 TAILKYGQVLMHSHWI IWLFLAVYAVATIMFCF LVSVLYSKAKL 813 

I : I I : I : : : =1111= I • : : I I I : : M 

Db 284 CVSFSILCTSILPFGKRIVEG-MVIIWITLIIAMHLSLEFEFDWLFWVPLLNPNYSLKLF 342 

Qy 814 ASACGGIIYFLSYVPYMYVAIREEVAHDKITAFEKCIASLMSTTAFGLGSKYFALYEVAG 873 

I 11:1 I : I I : 

Db 343 VDAT FLASGPN GTPTSALF 361 

Qy 874 VGIQWHTFSQSPVEGDDFNLLLAWMLMVDAVVY — GILTWYIEAVHPGMYGLPRPWYFP 931 

: I : I I :::: |::|: I : : : |: |: 

Db 362 -SSKKKTLQSAAVY FGIMI SCTWMLVAAI FMEKLYTFVGHAI F 404 

Qy 932 LQKSYWLGSGRTEAWEWSWPWARTPRLSVME EDQACAMESRRFEETRGMEEEPTH 98 6 

| : | | : : : | : | | | : : : : : III 

Db 405 — KRFWRILG FSKGKRSKI EERGDGVEDRSTI LQCKETVEGRGSAI ADI E 452 

Qy 987 LPLWCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTS 1046 

I I I I I : : : I I : I I I I II I I I I I I I : I I : : I I : I 

Db 453 L— S GLVKVYQNGEK-AVNGLS LRAI RGQVS I LLGHNGCGKSTT FGMI TGMHQATE 505 

Qy 1047 GSATI YGHDI RTEMDEI RKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMA-QEEI RREMD 1105 

I I I I I I : : I I I I : I : : I I I I I I I : I I : : : : : 

Db 506 GKVMIGGIDANANRAEARELIGYCPQYNPIYDELTVWEHLRLVNALKGRSGGSDFKMDAE 565 

Qy 1106 KMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDL 1165 

::: :||::||::| : Ml I I III I I :| HIM hllllllhll II : : = 

Db 566 SLLKQI ELTDKRNTLAKNLSGGMKRKLCVCMAMI GGSRVT LLDEPTAGMDPSARI DVQNM 625 

Qy 1166 ILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLV-KR 1224 

: I I I I I I : I I : I I I I : I I I I :: I I II : II =11 I I I I I I I I 

Db 626 LAIjVKADRTILLTTHYMDEAEKLGDWIFVMSH 685 

Qy 1225 PAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYI 1278 

: I I : : : : : I I I I I I : : I 

Db 686 VHDPMRPRK S YET AYDVC KT VC S T ALVKD ERGQMI E I S I LET E 728 

Qy 1279 LPS EAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEE 1325 

||: I I : I : : I I : : I : : : I I : I I : I : ■ : I 

Db 729 KSRLPTLLKILESVMEEDYNNPEFQALEPDIQEKCRTLELATIGVSMSSLEQVFIKIGDE 788 

Qy 1326 -DQSLENSEADWESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARG 1384 

I : : I I I : : : I I I : 

Db 789 CDDIMNGTGVDKKTERQE KFSTLVQYKI 816 

Qy 1385 DEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLL 1444 

I I II II : II 

Db 817 QQPK QGFSKL MMWWALL 834 



Qy 1445 VKRFHCARRNSKALFSQILLPA FFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

II : II : 11:11 I : : |:: h I I I I : I

Db 835 QKRAYYLYRNPVQITLQIILPLLTLWLFAVPFLRLEPKPPKLSDIES-- FDPSQYPHST- 891



Qy 1501 PRGNFI PYANEERREYRLRLS PDAS PQQLVSTFRLPSGVGATCVLKS PANGS LGPTLNLS 1560 

III 

Db 892 : VLLQLEN 898 

Qy 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLP'PTA 1620 

: II I : : I III 

Db 899 ENDDRL— ANYLNS FSNF 914 

Qy 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680 

III | | : | : : | | I : 

Db 915 EWFKT LGFIVKVNKKGDSKFYKISQGD KNAA 946 

Qy 1681 E YL L FT S D R FRLH R Y GAI T FGN VL K S I PAS FGT RAP PMVRK I AVRRAAQ VF YNN KG YH SM 174 0 

: I : I I I III 

Db 947 1 LMNI IASAMYLRDPSVTK 965 

Qy 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800 

I I I I I I : : I : I I : 

Db 966 LPHVTSRVIWMNDPKIK YEGLASFFLFEN IFFL 998 

Qy 1801 VAMSFVPAS FWFLVAEKSTKAKHLQFVSGCNPIIYWLANYWDMLNYLVPATCCVIILF 1860 

: : : : I : I : I I I I I : : : I : I I I : : I I I II 
Db 999 LVLAGIFIQSTVYLIEEKICKFAHQQYLTGLSTIAYWGWFLWDFL LF 1046 

Qy 1861 VFDLPAYTSPTNFPAVLSLFLLYG — WSITPIMYPASFWFEVPSSAYVFLIVINLFIGIT 1918 

I I I I : : | : | | I I I I : I I : : I I I 
Db 1047 TFFL-LYT 1 GFLI S FGVLQGHI HEI WI FYGLLFYF — APLVYLTSALIN T 1094 

Qy 1919 ATVATFLLQLF EHDKDLKWNSYLKSCFLIF- PNYNLGHGLMEMAY 1963 

I I I I : I :::::: | | | | : I I I : : I 

Db 1095 PTRGNFLLYMFCCIPWIiAYSIVSELHNFPPIQKYSDEIEYGFRIFNPSIGFLAGLMKIA- 1153 

Qy 1964 NEYINEYYAKIG QFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIM 2009 

: I III I :::::: : I : : I : III: 
Db 1154 — ALN — YPKSGLDKHFEHLTNLWTYEGIFFELMFLFFGGI FLTILLGCATLKPFRR 1206 

Qy 2010 -CQYNFLRRPQ-RMPVSTKPVEDDVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGR 2067 

I I I I I 1:1 I I I I | : : : | : | | : I : 

Db 1207 ACFRGTRRRSQPRERKRYKGIESCKAVKEEEQLVQEVDKNETVLVIDGLVKDF — GK 1261 

Qy 2068 ILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQ 2127 

I I : I : I I I I I : I I I I I I I I : I I :: I I I I I I : : I I : : : 
Db 1262 FRAVNDLSISVGHEECFGMLGANGAGKTTTFDIITGLTMPTGGSATIDGHDITETI 1317 

Qy 2128 QSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKPAGTY 2187 

: I I I I I I I : : : : I : I : : : I : I : : I I : I : : : : : I 
Db 1318 -HIGYCPQFDAMLQQISCRQTLRIMAKLQG — YPNVKEWELVLDCVGMSDFGYKLVKNC 1374 

Qy 2188 SGGNKRKLSTAI7VLIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGR-SWLTSHSM 2246 

111111:1 III: I I I I I I I : I I : I I I : I : : I : : : : I : I I I I I I I 

Db 1375 SGGQKRKISVGI7VLMSRATCIILDEPTAGIDPRARREIWDIIHEMREQAKCSIVLTSHSM 1434 

Qy 2247 EECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKDWRFFNRNFPE 2306 

I I I I I I I I I : I : I : I I : I I I : : : I : I I : I : I : : : I : I : 






Db 1435 EECEALCTRIGILRKGEMIALGTSQSLKSQYGNTYMMTLILNSLEDLESVCVIVSEEMPD 1494 

Qy 2307 AMLKERHH TKVQYQL-KSEHIS1AQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFA 2362 

| : | | | : : : | | | : :::::: | : : : I I : : I : I : : I : 

Db 1495 AVLKTPESSLTTSIVWELPKSKSDKWSEKYNQVEVLAKKANAKDYMLTQASLEDTFIRL- 1553 

Qy 2363 KKQSDNLEQQETEPPSA 2379 

: : I I II 
Db 1554 ITTEEEEEASA 1564 

Search completed: September 1, 2004, 10:58:17
Job time : 94 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:          September 1, 2004, 10:57:18 ; Search time 217 Seconds
                 (without alignments)
                 3531.784 Million cell updates/sec

Title:           US-10-088-467-2
Perfect score:   12668
Sequence:        1 MGFLHQLQLLLWKNVTLKRR...GLISFEEERAQLSFNTDTLC 2436

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        1297172 seqs, 314612898 residues

Total number of hits satisfying chosen parameters: 1297172

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       Published_Applications_AA:
                 1:  /cgn2_6/ptodata/1/pubpaa/US07_PUBCOMB.pep:*
                 2:  /cgn2_6/ptodata/1/pubpaa/PCT_NEW_PUB.pep:*
                 3:  /cgn2_6/ptodata/1/pubpaa/US06_NEW_PUB.pep:*
                 4:  /cgn2_6/ptodata/1/pubpaa/US06_PUBCOMB.pep:*
                 5:  /cgn2_6/ptodata/1/pubpaa/US07_NEW_PUB.pep:*
                 6:  /cgn2_6/ptodata/1/pubpaa/PCTUS_PUBCOMB.pep:*
                 7:  /cgn2_6/ptodata/1/pubpaa/US08_NEW_PUB.pep:*
                 8:  /cgn2_6/ptodata/1/pubpaa/US08_PUBCOMB.pep:*
                 9:  /cgn2_6/ptodata/1/pubpaa/US09A_PUBCOMB.pep:*
                 10: /cgn2_6/ptodata/1/pubpaa/US09B_PUBCOMB.pep:*
                 11: /cgn2_6/ptodata/1/pubpaa/US09C_PUBCOMB.pep:*
                 12: /cgn2_6/ptodata/1/pubpaa/US09_NEW_PUB.pep:*
                 13: /cgn2_6/ptodata/1/pubpaa/US10A_PUBCOMB.pep:*
                 14: /cgn2_6/ptodata/1/pubpaa/US10B_PUBCOMB.pep:*
                 15: /cgn2_6/ptodata/1/pubpaa/US10C_PUBCOMB.pep:*
                 16: /cgn2_6/ptodata/1/pubpaa/US10_NEW_PUB.pep:*
                 17: /cgn2_6/ptodata/1/pubpaa/US60_NEW_PUB.pep:*
                 18: /cgn2_6/ptodata/1/pubpaa/US60_PUBCOMB.pep:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
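
Two of the figures printed for this run appear to follow from simple arithmetic on other printed values. The sketch below checks both. The function names are hypothetical, and both formulas are assumptions inferred from the numbers in this printout (query length 2436, 314612898 database residues, 217 seconds, perfect score 12668), not documented GenCore definitions.

```python
def query_match_percent(score, perfect_score):
    """Hit score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Dynamic-programming cells filled per second, in millions.

    Assumption: one cell per (query residue, database residue) pair.
    """
    return query_len * db_residues / seconds / 1e6


# Score 12660 (RESULT 1 of this run) against the perfect score 12668.
print(round(query_match_percent(12660, 12668), 1))  # 99.9
# Query length, residues searched and search time from the run header.
print(round(million_cell_updates_per_sec(2436, 314612898, 217), 3))  # 3531.784
```

Both values reproduce the printed "Query Match 99.9%" and "3531.784 Million cell updates/sec", which is consistent with the assumed definitions.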
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/I TOO c 

4Z0o . 3 


9 9 
00 . 


4 


9 9 61 
Z Z D X 


1 0 

A. U 


Uq_nQ-QR4-R9 7-1 ^1 

UO U_7 J U 4 OZ/ xox 


Sequence 


131, App 


Zo 


/l O Q 9 

4Z0o . o 


9 9 
00 . 


A 
4 


9 9 61 
ZZDX 


1 0 
X u 


nq-0Q-QR4-R97-1 34 

Uo U ZJ _/OH OZ/ XOM 


Sequence 


134, App 


Z4 


4Z OZ . 0 


9 9 
OO . 


A 

4 


9961 
ZZDX 


1 0 
X u 


TTc_nQ-QR4-R97-1 9R 

Uo U.7 _/0*± OZ/ xzu 


Sequence 


128, App 


Z5 


/I O Q O C 

4Z0Z . o 


9 9 

0 0 . 


A 

4 


99 61 
ZZDX 


1 0 
X u 


nq_nQ-QR4-R97-19Q 

Uo U-7 ~7 O'-i. O^l X£^Z7 


Sequence 


129, App 


9 C 
Z b 


/I 9 9 9 ^ 

4Z0Z . 3 


99 
OO . 




9 9 61 
ZZDX 


1 0 
X u 


TTc„nQ-QR4-R97-133 

UO \J ZJ _70*± (J C 1 XOO 


Sequence 


133, App 


O 9 
Z / 


A 9 99 ^ 
4 Z OZ . D 


99 
O O . 


*1 


9961 


1 0 

J. \J 


ijq-09-984-827-136 

VJO \J -J »J 1. \J C+ 1 -L — / w 


Sequence 


136, App 


o o 

Z o 


/I 9 9 9 P. 
4Z0Z - 3 


9 9 
00 . 


4 


996^ 
ZZ DO 


1 9 
J. z 


tto„i 0-976-774-2326 

Uo XU Z/U //*± A J^. U 


Sequence 


2326, Ap 


z y 


4ZoU . 3 


9 9 
00 . 


/i 

. 4 


99 61 
ZZDX 


1 0 
X u 


tto_0Q— QR4-R97-1 ^5 

Uo U_7 ^0*± OZ/ X O — J 


Sequence 


135, App 


oU 


/I 9 9 Q c; 

4zz y . 3 


9 9 
0 0 . 


A 

, 4 


9 9 61 
ZZDX 


1 o 
X u 


TTc_nQ — QR4-R97-1 ^9 

UO U-7 ^01 OZ/ xoz 


Sequence 


132, App 


ol 


/l 9 9 9 ^ 

4ZZo . 3 


9 9 
00 . 


9 

. 0 


9 9 61 
ZZDX 


1 0 
X u 


iTc_0Q-Qfl4-fi?7-1 30 

Uo U_7 -70M OZ/ XOU 


Sequence 


130, App 


9 9 
OZ 


A 1 Q9 R 

4 iyz . o 


9 9 
OO . 


i 

. X 


9961 

ZZDX 


1 4 


tto_-i 0 — ^40 — 097 — 118 

UO XU OTU \J Z7 f XXU 


Sequence 


118, App 


9 9 


A 1 Q9 R 

4 iyz . o 


9 9 
OO « 


. 1 


9 9 61 
ZZDX 


1 4 


TT9-1 0-^^6-91 5-1 1 8 
uo XU oou zxo xxu 


Sequence 


118, App 


9 yl 

o4 


A T Q9 R 

4 iyz . 3 


9 9 
00 . 


. 1 


9 9 61 
ZZDX 


1 4 

X *i 


tto_i 0-^^6-91 Q-1 1 fl 

UO XU OOU £i. X ZJ ■ xxo 


Sequence 


118, App 


Q C 
JJ 


A 1 A 1 

4 141 


99 
OZ « 


7 


9901 
Z Z U X 


1 9 

J. z 


ttc_-i n„-i 70-^85-293 

UO XU X/U OOO £. ZJ ~J 


Sequence 


293, App 


0 D 


/I 1 /l 1 
414 1 


99 
OZ . 


7 


9901 

Z Z U X 


1 S 


tto„i n_99i -4Q6A-29 

UO XU OOX 1 ^ vJ^T. 


Sequence 


29, Appl 


97 
0 / 


4141 


OZ . 


7 


99D1 

Z Z U _L 


1 6i 


US-1 0-429-160-4 

UO XU a ^* J.UU i 


Sequence 


4, Appli 


9 Q 

oo 


/lino 
4 1U0 


99 
OZ . 


A 
► 4 


9901 
Z Z U X 


Q 


ttc_0Q-QQS-54 9-9 

U O U _7 ZJ ZJ O.M Z J 


Sequence 


9, Appli 


9 Q 


oy oi 


9 1 

O J. , 


9 
. z 


9^10 


Q 


US- 09-99 5-542-10 

UO W ZJ ZJ ZJ O — ' " XU 


Sequence 


10, Appl 


40   3875.5   30.6   2273   12   US-10-182-006-6      Sequence 6, Appli
41   3857.5   30.5   2273    9   US-09-995-542-12     Sequence 12, Appl
42   3834.5   30.3   2273   14   US-10-340-097-3      Sequence 3, Appli
43   3834.5   30.3   2273   14   US-10-336-215-3      Sequence 3, Appli
44   3834.5   30.3   2273   14   US-10-336-219-3      Sequence 3, Appli
45   3834.5   30.3   2273   15   US-10-295-027-1279   Sequence 1279, Ap



ALIGNMENTS 



RESULT 1 
US-10-380-727-2 

; Sequence 2, Application US/10380727 
; Publication No. US20040024183A1 
; GENERAL INFORMATION: 

; APPLICANT: INCYTE GENOMICS, INC.; LEE, Ernestine A.;
; APPLICANT: YUE, Henry; LAL, Preeti G.;
; APPLICANT: CHAWLA, Narinder K.; BAUGHN, Mariah R.;
; APPLICANT: WARREN, Bridget A.; LEE, Sally;
; APPLICANT: SANJANWALA, Madhu S.; YAO, Monique G.;
; APPLICANT: RAMKUMAR, Jayalaxmi; THORNTON, Michael;
; APPLICANT: GANDHI, Ameena R.; POLICKY, Jennifer L.;
; APPLICANT: ELLIOTT, Vicki S.; ARVIZU, Chandra;
; APPLICANT: RAUMANN, Brigitte E.; BRUNS, Christopher M.;
; APPLICANT: NAINA, Amir; HAFALIA, April J.A.;
; APPLICANT: NGUYEN, Danniel B.; XU, Yuming;
; APPLICANT: LU, Dyung Aina M.; ISON, Craig H.;
; APPLICANT: GRIFFIN, Jennifer A.; REDDY, Roopa M.;
; APPLICANT: BURFORD, Neil
; TITLE OF INVENTION: TRANSPORTERS AND ION CHANNELS
; FILE REFERENCE: PI-0217 USN
; CURRENT APPLICATION NUMBER: US/10/380,727
; CURRENT FILING DATE: 2003-03-14
; PRIOR APPLICATION NUMBER: PCT/US01/28938
; PRIOR FILING DATE: 2001-09-14
; PRIOR APPLICATION NUMBER: US 60/241,700
; PRIOR FILING DATE: 2000-10-18
; PRIOR APPLICATION NUMBER: US 60/240,540
; PRIOR FILING DATE: 2000-10-13
; PRIOR APPLICATION NUMBER: US 60/239,057
; PRIOR FILING DATE: 2000-10-05
; PRIOR APPLICATION NUMBER: US 60/236,882
; PRIOR FILING DATE: 2000-09-29
; PRIOR APPLICATION NUMBER: US 60/234,842
; PRIOR FILING DATE: 2000-09-22
; PRIOR APPLICATION NUMBER: US 60/232,685
; PRIOR FILING DATE: 2000-09-15
; NUMBER OF SEQ ID NOS: 52
; SOFTWARE: PERL Program
; SEQ ID NO 2
; LENGTH: 2436
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: misc_feature
; OTHER INFORMATION: Incyte ID No: 7078207CD1
US-10-380-727-2 

Query Match 99.9%; Score 12660; DB 16; Length 2436; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 2435; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60
     |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVSFYTAA 60

Qy 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120

Qy 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180

Qy 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240

Qy 241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300

Qy 301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360

Qy 361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420

Qy 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480

Qy 481 AFVGNVTHYAQVWLNTSAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540

Qy 541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600

Qy 601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660

Qy 661 VWIQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 VWIQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720

Qy 721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780

Qy 781 SHWIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 SHWIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

Qy 841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900

Qy 901 MVDAWYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 901 MVDAWYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020



Qy 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080

Qy 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260

Qy 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320

Qy 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380

Qy 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440

Qy 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

Qy 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560

Qy 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620

Qy 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680

Qy 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

Qy 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800

Qy 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

Qy 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920



Qy 1921 VATFLLQLFEHDKDLKVWSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 198 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1921 VATFLLQLFEHDKDLKVWSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980 

Qy 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I 

Db 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040 

Qy 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 

I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I II I | | | | | | | | | I | | | I | | | | | | I I I I 
Db 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 

Qy 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 

Db 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

Qy 2161 KDEARWKW7VLEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 



Qy 2221 ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2221 ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 

Qy 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 

Db 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 



Qy 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

Qy 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 



RESULT 2 
US-09-795-693-8 

; Sequence 8, Application US/09795693 

; Patent No. US20020068710A1 

; GENERAL INFORMATION: 

; APPLICANT: Glucksmann, Maria A. 

; TITLE OF INVENTION: 20685, 579, 17114, 23821, 33894, and 

; TITLE OF INVENTION: 32613, Novel Human Transporters

; FILE REFERENCE: 35800/209292 

; CURRENT APPLICATION NUMBER: US/09/795,693

; CURRENT FILING DATE: 2001-02-28 

; PRIOR APPLICATION NUMBER: 60/185,906 

; PRIOR FILING DATE: 2000-02-29 

; NUMBER OF SEQ ID NOS: 42
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 8
; LENGTH: 2436
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-795-693-8 



Query Match 99.9%; Score 12656; DB 9; Length 2436; 

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 2434; Conservative 1; Mismatches 1; Indels 0; Gaps 0; 
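The summary counts above can be reproduced from any gap-free pair of aligned lines. The following is a minimal sketch, assuming that "Best Local Similarity" is the number of identical positions divided by the aligned length (this assumption reproduces the reported 99.9% for 2434 matches over 2436 positions). The function name `alignment_stats` and the `conservative_pairs` argument are hypothetical; a real search program would decide which substitutions are conservative from its scoring matrix.

```python
def alignment_stats(query, db, conservative_pairs=frozenset()):
    """Classify each column of two equal-length, gap-free aligned sequences."""
    if not query or len(query) != len(db):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = conservative = mismatches = 0
    for q, d in zip(query, db):
        if q == d:
            matches += 1            # identical residues ('|' in the listing)
        elif frozenset((q, d)) in conservative_pairs:
            conservative += 1       # similar residues (':' in the listing)
        else:
            mismatches += 1         # unrelated residues (blank in the listing)
    return {
        "matches": matches,
        "conservative": conservative,
        "mismatches": mismatches,
        # Assumed definition: identical positions over aligned length.
        "similarity_pct": round(100.0 * matches / len(query), 1),
    }


# Toy fragment only, not the full 2436-residue alignment.
# Treating M/I as a conservative pair is an illustrative assumption.
toy = alignment_stats("KEVPFYTLMH", "KEVSFYTLIH",
                      conservative_pairs={frozenset("MI")})
print(toy)

# The reported totals: 2434 identical positions out of 2436.
reported_pct = round(100.0 * 2434 / 2436, 1)
print(reported_pct)
```

With the reported totals, 2434 + 1 + 1 = 2436 accounts for every position of the query, which agrees with the "Indels 0; Gaps 0" entry.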

Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60
     |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVSFYTAA 60

Qy 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I i I I I I I I I I I I I I I I I I I 
Db 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120 

Qy 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180 

Qy 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240 

Qy 241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300 

I I M I I I I I I I I I I I I I I | | | M | | | | | | | | I I I I I I I I I I I I I | | | | | | | | | | | | | | | | 
Db 241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300 

Qy 301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I | | | | | | | | | | | | | | 
Db 301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360 

Qy 361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420

Qy 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480 

Qy 481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | I I I I I I I I I I I I I I I I II I I I I I 
Db 481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540

Qy 541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I | | | | | | | | | | 
Db 541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600 

Qy 601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660

Qy 661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720

Qy 721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
Db 721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLIH 780

Qy 781 SHWIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 SHWIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

Qy 841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900

Qy 901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 MEEDQACAMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020

Qy 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080

Qy 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260

Qy 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320

Qy 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380

Qy 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440

Qy 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

Qy 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560

Qy 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620

Qy 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRVWGDILTDITGHNVS 1680

Qy 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

Qy 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800

Qy 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

Qy 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920

Qy 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980

Qy 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040

Qy 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100

Qy 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160

Qy 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220

Qy 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280

Qy 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340

Qy 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400

Qy 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436
        ||||||||||||||||||||||||||||||||||||
Db 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436



RESULT 3 
US-10-156-239-8 

; Sequence 8, Application US/10156239 

; Publication No. US20030036074A1 

; GENERAL INFORMATION: 

; APPLICANT: Glucksmann, Maria A. 

; APPLICANT: Kapeller-Libermann, Rosana 

; TITLE OF INVENTION: Novel Nucleic Acid Sequences Encoding Human Transporters, A Human
; TITLE OF INVENTION: ATPase Molecule, A Human Ubiquitin Hydrolase-Like Molecule, A Human
; TITLE OF INVENTION: Ubiquitin Conjugating Enzyme-Like Molecule, and Uses Therefor
; FILE REFERENCE: 35800/247645
; CURRENT APPLICATION NUMBER: US/10/156,239 
; CURRENT FILING DATE: 2002-05-24 
; PRIOR APPLICATION NUMBER: 09/795,693 
; PRIOR FILING DATE: 2001-02-28 
; PRIOR APPLICATION NUMBER: 60/185,906 
; PRIOR FILING DATE: 2000-02-29 
; PRIOR APPLICATION NUMBER: 09/809,557 
; PRIOR FILING DATE: 2001-03-15 
; PRIOR APPLICATION NUMBER: 60/192,018 
; PRIOR FILING DATE: 2000-03-24 
; PRIOR APPLICATION NUMBER: 09/808,568 
; PRIOR FILING DATE: 2001-03-14 
; PRIOR APPLICATION NUMBER: 60/191,790 

; PRIOR FILING DATE: 2000-03-24
; PRIOR APPLICATION NUMBER: 09/808,767 
; PRIOR FILING DATE: 2001-03-15 
; PRIOR APPLICATION NUMBER: 60/191,781 
; PRIOR FILING DATE: 2000-03-24 
; NUMBER OF SEQ ID NOS: 60
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 8
; LENGTH: 2436
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-156-239-8 

Query Match 99.9%; Score 12656; DB 14; Length 2436; 

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 2434; Conservative 1; Mismatches 1; Indels 0; Gaps 0; 



Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60
     |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVSFYTAA 60

Qy 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I 
Db 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120 

Qy 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 

Db 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180 



Qy 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240

Qy 241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300

Qy 301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360

Qy 361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420

Qy 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480

Qy 481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540

Qy 541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600

Qy 601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660

Qy 661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720

Qy 721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
Db 721

Qy 781 SHWIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 SHWIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

Qy 841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900

Qy 901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 MEEDQACAMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020



Qy 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080

Qy 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260

Qy 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320

Qy 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380

Qy 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440

Qy 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

Qy 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560

Qy 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620

Qy 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680

Qy 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

Qy 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800

Qy 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

Qy 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920



Qy 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980 

Qy 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040 

I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 

Db 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040 

Qy 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 

Qy 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

Qy 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I J I I I I I I I I I I I I I I I I I I I I I I 
Db 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 

Qy 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 

Qy 2281 YMITVRTKSSQSVKDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I'l I I I I I I I 

Db 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 

Qy 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

Qy 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 



RESULT 4 
US-10-199-485-8 

; Sequence 8, Application US/10199485 

; Publication No. US20030077626A1 

; GENERAL INFORMATION: 

; APPLICANT: Glucksmann, Maria A. 

; APPLICANT: Silos-Santiago, Inmaculada 

; TITLE OF INVENTION: 20685, 579, 17114, 23821, 33894, and 

; TITLE OF INVENTION: 32613, Novel Human Transporters

; FILE REFERENCE: 35800/249468 

; CURRENT APPLICATION NUMBER: US/10/199,485

; CURRENT FILING DATE: 2002-07-18 

; PRIOR APPLICATION NUMBER: 09/795,693 

; PRIOR FILING DATE: 2001-02-28 

; PRIOR APPLICATION NUMBER: 60/185,906 

; PRIOR FILING DATE: 2000-02-29 

; NUMBER OF SEQ ID NOS: 42
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 8
; LENGTH: 2436
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-199-485-8 



Query Match 99.9%; Score 12656; DB 14; Length 2436; 

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 2434; Conservative 1; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60
     |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVSFYTAA 60

Qy 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120 

Qy 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I U I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180 



Qy 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240 

Qy 241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300 



Qy 301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360

Qy 361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420

Qy 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480 

Qy 481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540 

Qy 541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600

Qy 601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660



Qy 661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720

Qy 721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I

Db 721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLIH 780

Qy 781 SHWIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 781 SHWIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

Qy 841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900

Qy 901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWAKTPRLSV 960

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 901 MVDAVWGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 961 MEEDQACAMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020

Qy 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080

Qy 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140

I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I

Db 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260

Qy 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320

Qy 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380

Qy 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440

Qy 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

Qy 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560



Qy 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620 

Qy 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680

Qy 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

Qy 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800 

Qy 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I

Db 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

Qy 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920 

Qy 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980 

I II I I I I I I I I I I I I I I I I I II I I I I I II I II I I I I I I I I I I I I I I II I I I I II I I I I I I 
Db 1921 VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980 

Qy 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040

Qy 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100

Qy 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 

Qy 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 

Qy 2221 ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I 

Db 2221 ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 

Qy 2281 YMITVRTKSSQSVKDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 

I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2281 YMITVRTKSSQSVKDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340



Qy 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400

I I I I I I I I I I I I I I I I I I I II I I I I II II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400



Qy 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 



RESULT 5 
US-10-072-621-8 

Sequence 8, Application US/10072621 
Publication No. US20020169137A1 
GENERAL INFORMATION: 
APPLICANT: Reiner, Peter B. 
APPLICANT: Connop, Bruce P. 
APPLICANT: Pollard, Michelle 

TITLE OF INVENTION: REGULATION OF AMYLOID PRECURSOR PROTEIN EXPRESSION 
TITLE OF INVENTION: BY MODIFICATION OF ABC TRANSPORTER EXPRESSION OR ACTIVITY

FILE REFERENCE: 100103.402 

CURRENT APPLICATION NUMBER: US/10/072,621 
CURRENT FILING DATE: 2002-02-08 
NUMBER OF SEQ ID NOS : 10 

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 8 
LENGTH: 2001 
TYPE: PRT 

ORGANISM: Homo sapiens 
FEATURE : 

NAME/KEY: VARIANT 

LOCATION: 30, 70, 280, 477, 558, 1471, 1651, 1689, 1724 
OTHER INFORMATION: Xaa = Any Amino Acid 
FEATURE : 

NAME/KEY: VARIANT

LOCATION: 30, 70, 280, 477, 558, 1471, 1651, 1689, 1724
OTHER INFORMATION: Xaa = Any Amino Acid
US-10-072-621-8 

Query Match 80.9%; Score 10249; DB 13; Length 2001; 

Best Local Similarity 98.5%; Pred. No. 0; 

Matches 1973; Conservative 2; Mismatches 26; Indels 2; Gaps 2; 

Qy 434 MSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVW 493

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MSSLGFTSKEQRNLGLLVHLMTSNPKILYXPAGSEVDRVILKANETFAFVGNVTHYAQVW 60 

Qy 494 LNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPALRQDNFSLPSGMA 553 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 LNISAEIRSXLEQGRLQQHLRWLQQYVAELRPHPEALNLSLDELPPALRQDNFSLPSGMA 120 

Qy 554 LLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFASVIFQTR 613

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 LLQQLDTIDNAPCGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAGVIFQTR 180 

Qy 614 KDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGFVWIQDMMERAIID 673 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 KDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGFVWIQDMMERAIID 240



Qy 674 TFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQHIVAEKEH 733

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I

Db 241 TFVGHDWEPGSWQMFPYPCYTRDDFLFVIEHMMPLCMXISWVYSVAMTIQHIVAEKEH 300


Qy 734 RLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSHWIIWLFLAVY 793

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I M I I I I I I I I I I M

Db 301 RLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSHWIIWLFLAVY 360

Qy 794 AVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDKITAFEKCIASL 853

I M I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I

Db 361 AVATIMFCFLVSVLYSKAKLASA-GGIIYFLSYVPYMYVAIREEVAHDKITAFEKCIASL 419

Qy 854 MSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILTWY 913

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M

Db 420 MSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAWYGILXWY 479

Qy 914 IEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWAKTPRLSVMEEDQACAMESRR 973

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 480 IEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQACAMESRR 539

Qy 974 FEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLGHNGAGKTT 1033

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 540 FEETRGMEEEPTHLPLWXVDKLTKVYKDDKKLALNKLSLNLYENQGVSFLGHNGAGKTT 599

Qy 1034 TMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLK 1093

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 600 TMSILTGLFPPTSGSATIYGHDIRTEMDEIRKN-GHVPQHNVLFDRLTVEEHLWFYSRLK 658

Qy 1094 SMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAG 1153

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I

Db 659 SMAQEEIPREMDKMIEDLELSNKRHSLVQTLSGGMKRKVSVAIAFVGGSRAIILDEPTAG 718

Qy 1154 VDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTY 1213

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 719 VDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTY 778

Qy 1214 GDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTST 1273

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 779 GDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTST 838

Qy 1274 ELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSE 1333

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 839 ELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSGGDQSLENSG 898

Qy 1334 ADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDV 1393

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 899 ADVKESRKDVLPGAEGHASGEGHAGNLARCSELTQSQASLQSASSVGSALGDEGAGYTDV 958

Qy 1394 YGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARR 1453

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 959 YGDYPPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARR 1018

Qy 1454 NSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEER 1513

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1019 NSKALFSQILLPAFEVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEER 1078


Qy 1514 REYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFD 1573

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1079 REYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFD 1138



Qy 1574 SMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPR 1633

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | || I I I I I I I I I I I

Db 1139 SMCLESFTQGLPLSNEVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGQEMWTSAPSLPR 1198

Qy 1634 LVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLH 1693

N Ml II I I I I M I Mill 1:1 IN I I MMMI IIIMII I I II I II I M I

Db 1199 LVREPVRCTCSAQGTGFSCPNSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLH 1258

Qy 1694 RYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNNAILR 1753

IIIIH II I I I I I I I M M I Ml I M MM I I I I I I I I I I M I

Db 1259 RYGAITFGNVLKSIPASFGTRAPPMVRKIRCARAAQVFYNNKGYHSMPTYLNSLNNAILR 1318

Qy 1754 ANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAMSFVPASFWF 1813

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1319

Qy 1814 LVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNF 1873

I I I I I I I I I I I I M I I I I I I I I I I Ml Ml II

Db 1379 LVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNF 1438

Qy 1874 PAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDK 1933

Ml I I I I I I I I I I I I I II I II I M I I M I I II I M I I I II I I I I I I I M I II II II I II

Db 1439 PAVLSLFLLYGWSITPIMYPASFWFEVPSSAYXFLIVINLFIGITATVATFLLQLFEHDK 1498

Qy 1934 DLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGL 1993

I I I I I I I I II I I II I Ml I M I I I M I I II I I I M I I I I I I I I I I I I I | I | M I

Db 1499 DLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGL 1558

Qy 1994 VAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKI 2053

I M M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I

Db 1559 VAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKI 1618

Qy 2054 ENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAF 2113

I I I I I I I I I M I I I I II I I I I M II II II I II I I I II M M II I I I M M M II

Db 1619 ENLTKVYKSRKIGRILAVDRLCLGVRPGECFGXLGVNGAGKTSTFKMLTGDESTTGGEAF 1678

Qy 2114 VNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEK 2173

I M I I I I I M Ml I I I I I I I I I I I I I I I I I I I I I I I | | | | I I I I M I I I I I I I II M I

Db 1679 VNGHSVLKELXQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGIXWKDEARWKWALEK 1738

Qy 2174 LELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLI 2233

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I

Db 1739 LELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLI 1798

Qy 2234

I I I I I I I I M I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1799

Qy 2294 KDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTT 2353

I I M I I I I I I I I I I I I I I I I I I I I I I I I | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I II

Db 1859 KDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTT 1918

Qy 2354 LDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPEDLDT 2413

I I M I I I I I I I I I M M I M I I I I II I II I I I II I II | M II M II I II II I I I I M I M

Db 1919 LDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPEDLDT 1978



Qy 2414 EDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I I II I I I I I I II I I II I 
Db 1979 EDEGLISFEEERAQLSFNTDTLC 2001



RESULT 6 

US-10-297-022-18 

Sequence 18, Application US/10297022 
Publication No. US20030216310A1 
GENERAL INFORMATION: 
APPLICANT: INCYTE GENOMICS, INC. 
APPLICANT: THORNTON, Michael 
APPLICANT: WALIA, Narinder K. 
APPLICANT: YUE, Henry 
APPLICANT: NGUYEN, Danniel B. 
APPLICANT: LAL, Preeti 
APPLICANT: GANDHI , Ameena R. 
APPLICANT: TRIBOULEY, Catherine M. 
APPLICANT: YAO, Monique G. 
APPLICANT: RAMKUMAR, Jayalaxmi 
APPLICANT: AU-YOUNG, Janice 
APPLICANT: LU, Yan 
APPLICANT: TANG, Y. Tom 
APPLICANT: AZIMZAI, Yalda 
APPLICANT: BRUNS, Christopher M. 
APPLICANT: GRIFFIN, Jennifer A. 
APPLICANT: YANG, Junming 
APPLICANT: BAUGHN, Mariah R. 
APPLICANT: SANJANWALA, Madhu S. 
APPLICANT: RAUMANN, Brigitte E. 
APPLICANT: LEE, Ernestine A. 
APPLICANT: HAFALIA, April 
APPLICANT: GREENE, Barrie D. 
APPLICANT: KHAN, Farrah A. 
APPLICANT: KEARNEY, Liam 
APPLICANT: ELLIOTT, Vicky S. 
APPLICANT: SEILHAMER, Jeffrey J. 
APPLICANT: POLICKY, Jennifer L. 
APPLICANT: BOROWSKY, Mark L. 
APPLICANT: BURFORD, Neil 
APPLICANT: DING, Li 
APPLICANT: LU, Dyung Aina M. 
APPLICANT: HILLMAN, Jennifer L. 

TITLE OF INVENTION: TRANSPORTERS AND ION CHANNELS 
FILE REFERENCE: PI-0109 PCT 
CURRENT APPLICATION NUMBER: US/10/297,022
CURRENT FILING DATE: 2002-11-25 

PRIOR APPLICATION NUMBER: 60/208,424; 60/209,001; 60/210,588; 60/212,335; 60/213,747; 60/215,391

PRIOR FILING DATE: 2000-05-26; 2000-06-01; 2000-06-08; 2000-06-16; 2000-06-22; 2000-06-29

NUMBER OF SEQ ID NOS : 54 
SOFTWARE: PERL Program 
SEQ ID NO 18 
LENGTH: 1771 
TYPE: PRT 

ORGANISM: Homo sapiens 



FEATURE : 

NAME/KEY: misc_feature

OTHER INFORMATION: Incyte ID No. 2311751CD1
US-10-297-022-18 



Query Match 72.9%; Score 9237; DB 15; Length 1771;

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1771; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 666 MMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQ 725

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | I M MM I

Db 1 MMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQ 60

Qy 726 HIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSHWI 785

M M I M M M M M M M I I M M M M M M I M M M M M I M M M M M M I M

Db 61 HIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSHWI 120

Qy 786 IWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDKITA 845

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIII

Db 121 IWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDKITA 180

Qy 846 FEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAV 905

IIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII

Db 181 FEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAV 240

Qy 906 VTGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQ 965

I I M I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 241 VTGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQ 300

Qy 966 ACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLG 1025

I I I I II I I I I I I II I I I I I M I II I I I I I I M I I I I I I I I I I I M I I I I I I I I I M I I I I

Db 301 ACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLG 360

Qy 1026 HNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEH 1085

I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I

Db 361 HNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEH 420

Qy 1086 LWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAI 1145

M I II I I I I II I I I II I I I II I M I I I I I M I I I I I I II M II I I I I II M I I I I I M I I

Db 421 LWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAI 480

Qy 1146 ILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGS 1205

I I I M I II II I II II I II II I I I I II II I I II I II I I I I II II II II M I II I I I I II M

Db 481 ILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGS 540

Qy 1206 PLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASC 1265

I I I M I I I I I I I I II I I I I I II II I II II I I II II I II II II I I II I I I I II I I II I M I

Db 541 PLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASC 600

Qy 1266 LLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEE 1325

I I I I I II I II I II I I I I II II II I II I I I I I I II I I I M II II M II II I I I I II II I II

Db 601 LLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEE 660



Qy 1326 1385

Db 661 720



Qy 1386 EGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLV 1445

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I

Db 721 EGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLV 780

Qy 1446 KRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNF 1505

I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I

Db 781 KRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNF 840

Qy 1506 IPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESR 1565

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 841 IPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESR 900

Qy 1566 LLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMW 1625

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml I I I I I I I I I I I I I I I I I I I I I I I I

Db 901 LLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMW 960

Qy 1626 TSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLF 1685

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I

Db 961 TSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLF 1020

Qy 1686 TSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLN 1745

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1021 TSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLN 1080

Qy 1746 SLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDVVIAIFIIVAMSF 1805

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1081 SLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAMSF 1140

Qy 1806 VPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLP 1865

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1141 VPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLP 1200

Qy 1866 AYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFL 1925

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1201 AYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFL 1260

Qy 1926 LQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFE 1985

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I

Db 1261 LQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFE 1320

Qy 1986 WDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGD 2045

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1321 WDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGD 1380

Qy 2046 ADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDE 2105

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1381 ADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDE 1440

Qy 2106 STTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEAR 2165

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1441 STTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEAR 1500

Qy 2166 VVKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFL 2225

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I

Db 1501 WKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFL 1560

Qy 2226 WNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITV 2285

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1561 WNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITV 1620

Qy 2286 RTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIE 2345

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1621 RTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIE 1680

Qy 2346 DYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVA 2405

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1681 DYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVA 1740

Qy 2406 DEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1741 DEPEDLDTEDEGLISFEEERAQLSFNTDTLC 1771



RESULT 7 

US-10-340-097-119 

Sequence 119, Application US/10340097 
Publication No. US20030162276A1
GENERAL INFORMATION: 
APPLICANT: Rattner, Amir 
APPLICANT: Sun, Hui 
APPLICANT: Lupski, James R. 
APPLICANT: Nathans, Jeremy
APPLICANT: Anderson, Kent L. 
APPLICANT: Leppert, Mark 
APPLICANT: Dean, Michael 
APPLICANT: Singh, Nanda 

APPLICANT: Shroyer, Noah F.
APPLICANT: Smallwood, Philip M. 
APPLICANT: Allikmets, Rando 
APPLICANT: Lewis, Richard A. 
APPLICANT: Li, Yixin 

TITLE OF INVENTION: Nucleic Acid And Amino Acid Sequences For ATP-Binding Cassette
TITLE OF INVENTION: Transporter And Methods Of Screening For Agents That Modify ATP-Binding Cassette
TITLE OF INVENTION: Transporter
FILE REFERENCE: BYLR0065 

CURRENT APPLICATION NUMBER: US/10/340,097
CURRENT FILING DATE: 2003-01-10
PRIOR APPLICATION NUMBER: US/09/032,438A
PRIOR FILING DATE: 1998-02-27 
PRIOR APPLICATION NUMBER: 60/039,388 
PRIOR FILING DATE: 1997-02-27 
NUMBER OF SEQ ID NOS : 120 
SOFTWARE: PatentIn version 3.1
SEQ ID NO 119 
LENGTH: 1472 
TYPE: PRT 
ORGANISM: Mouse 
US-10-340-097-119 



Query Match 56.2%; Score 7117; DB 14; Length 1472;

Best Local Similarity 94.2%; Pred. No. 0;

Matches 1388; Conservative 22; Mismatches 60; Indels 4; Gaps 4;



965 QACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQWSFL 1024 

MINIM M Mill I II III II I I MINI I I llhl I! II II I | | || | | Ml | | | | | 

1 QACAMESRHFEETRGMEEEPTHLPLWCVDKLTKVYKNDKKLALNKLSLNLYENQWSFL 60 

1025 GHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEE 1084 

M M I I I I I I I I I I II I II I I I I || | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 

61 GHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEE 120 

1085 HLWFYSRLKSMAQEEI RREMDKMI EDLELSNKRHSLVQTLSGGMKRKLS VAI AFVGGS RA 1 1 4 4 

IMMIIMI 11:1 MM I I I I III I III Ml I I I II II I I I I I I I II I 

121 HLWFYSRLKSMAQEEI RKETDKMI EDLELSNKRHSLVQTLSGGMKRKLS VAI AEVGGSRA 180 

1145 II LDEPTAGVDPYARRAI WDLI LKYKPGRTI LLSTHHMDEADLLGDRIAI I SHGKLKCCG 1204 

N II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | M I I I I I I I II I I M I I I I I 

1 8 1 I I LDEPTAGVDPYARRAI WDLI LKYKPGRTI LLSTHHMDEADLLGDRIAI I SHGKLKCCG 24 0 

1205 SPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVAS 1264 
M I II I I I II I I I I I I I : | | | M I I I I I II I I I | | | I I I I I I I I I I II I I I 
241 SPLFLKGAYKDGYRLTLVKQPAEPGTSQEPGLASSPSGCPRLSSCSEPQVSQFIRKHVAS 300 

1265 CLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSE 1324 

> M I I M I II II I I I I I I I I I I || | | | | | | | | | | | | | | | | | | | | || | | | | 

301 SLLVSDTSTELSYILPSEAVKKGAFERLFQQLEHSLDALHLSSFGLMDTTLEEVFLKVSE 360 

1325 EDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARG 1384 
N II I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I II II I I I I I I I I I I I I I 
361 EDQSLENSEADVKESRKDVLPGAEGLTAVGGQAGNLARCSELAQSQASLQSASSVGSARG 42 0 

1385 DEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLL 1444 

Ml MM III I IMMIIMI I II MM I Mill Ml I I MM I I M I I 

421 EEGTGYSDGYGDYRPLFDNLQDPDNVSLQEAEMEALAQVGQGSRKLEGWWLKMRQFHGLL 480 

1445 VKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGN 1504 

N M M M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 

481 VKRFHCARRNSKALCSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGN 540 

1505 FI PYANEERREYRLRLS PDAS PQQLVSTFRLPSGVGATCVLKS PANGSLGPTLNLS S GES 1564 

M I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 

541 FIPYANEERQEYRLRLSPDAS PQQLVSTFRLPSGVGATCVLKS PANGSLGPMLNLSSGES 600 

1565 RLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDED-LQAWNVSLPPTAGPE 1623 

N M M MN I I I II I I I II I I I I | | | | || || | | | | M | | | | | | | | I : I I I I I I I I I 

601 RLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPVKPDEDSLQAWNMSLPPTAGPE 660 

1624 MWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYL 1683 
N I I I I I I I I I I I I I I II I II I I I I | | | | | | | | | | | | | | | | | | | || | | | | | | | | | | | | 
661 TWTSAPSLPRLVHEPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYL 720 

1684 LFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTY 1743 
N II I I I II I I I I I I II I I I I I I I I I I I I I I I I I I | I I I I III I I I I I I I I I I I I 
721 LFTSDRFRLHRYGAITFGNVQKSIPASFGARVPPMVRKIAVRRVAQVLYNNKGYHSMPTY 780 

1744 LNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAM 1803 

I I I I I I I I I I I I I M I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

781 LNSLNNAILRANLPKSKGNPAAYKITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAM 840



Qy 1804 SFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFD 1863

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 SFVPASFWFLVAEKSTKAKHLQFVSGCNPVIYWLANYVWDMLNYLVPATCCVIILFVFD 900

Qy 1864 LPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVAT 1923

I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 901 LPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVAT 960

Qy 1924 FLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSP 1983 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 961 FLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSP 1020 

Qy 1984 FEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLR 2043 

I I I I I I I I I I I I I III III I I I I I I I II I I : I I I : I I I I I I I I I I I I I I I I I I I I I I 
Db 1021 FEWDIVTRGLVAMTVEGFVGFFLTIMCQYNFLRQPQRLPVSTKPVEDDVDVASERQRVLR 1080 

Qy 2044 GDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGV-RPGECFGLLGVNGAGKTSTFKMLT 2102 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1081 GDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVCVPGECFGLLGVNGAGKTSTFKMLT 1140

Qy 2103 GDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKD 2162 

I I II I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I III 
Db 1141 GDESTTGGEAFVNGHSVLKDLLQVQQSLGYCPQFDVPVDELTAREHLQLYTRLRCIPWKD 1200

Qy 2163 EARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKAR 2222 

I I : I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 
Db 1201 EAQWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKAR 1260 

Qy 2223 RFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYM 2282 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1261 RFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLHCLGSIQHLKNRFGDGYM 1320

Qy 2283 ITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVL 2342 

I I I I I I I I I : I I I I I I I I I I I I I I I : : , I I I I I I I I I I I I I I I I I I I I I I I III 
Db 1321 ITVRTKSSQNVKDVVRFFNRNFPEAHAQGKTPYKVQYQLKSEHISLAQVFSKMEQVVGVL 1380 

Qy 2343 GIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRA 2402 

I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I : I I I I I I I I I I I I I I I I I I I I . 
Db 1381 GIEDYSVSQTTLDNVFVNFAKKQSDNVEQQEAE-PSSLPSPLG-LLSLLRPRPAPTELRA 1438

Qy 2403 LVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1439 LVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 1472



RESULT 8 

US-10-336-215-119 

Sequence 119, Application US/10336215 
Publication No. US20030170852A1 
GENERAL INFORMATION: 



APPLICANT: Allikmets, Rando
APPLICANT: Anderson, Kent L.
APPLICANT: Dean, Michael
APPLICANT: Leppert, Mark
APPLICANT: Lewis, Richard A.
APPLICANT: Li, Yixin
APPLICANT: Lupski, James R.



APPLICANT: Nathans, Jeremy 
APPLICANT: Rattner, Amir 

APPLICANT: Shroyer, Noah F.
APPLICANT: Singh, Nanda 
APPLICANT: Smallwood, Philip 
APPLICANT: Sun, Hui 

TITLE OF INVENTION: Methods Of Screening And Diagnostics Using ATP-Binding Cassette
TITLE OF INVENTION: Transporter
FILE REFERENCE: APPI0089 

CURRENT APPLICATION NUMBER: US/10/336,215
CURRENT FILING DATE: 2003-04-11 
PRIOR APPLICATION NUMBER: 60/039,388 
PRIOR FILING DATE: 1997-02-27 
PRIOR APPLICATION NUMBER: 09/032,438 
PRIOR FILING DATE: 1998-02-27 
NUMBER OF SEQ ID NOS: 120
SOFTWARE: PatentIn version 3.2
SEQ ID NO 119 
LENGTH: 1472 
TYPE: PRT 
ORGANISM: Mouse 
US-10-336-215-119 



Query Match 56.2%; Score 7117; DB 14; Length 1472; 

Best Local Similarity 94.2%; Pred. No. 0; 

Matches 1388; Conservative 22; Mismatches 60; Indels 4; Gaps 4; 
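
Note on the summary figures: the "Best Local Similarity" value appears consistent with the number of exact matches divided by the total number of aligned columns (matches + conservative + mismatches + indels). This definition is an assumption inferred from the reported numbers, not something stated in the report. A minimal Python sketch of that arithmetic, with a hypothetical helper name:

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of aligned columns that are exact matches.

    Assumed definition: inferred from the figures in this report,
    not taken from the search tool's documentation.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Counts reported for this result: 1388 / 22 / 60 / 4
print(best_local_similarity(1388, 22, 60, 4))
```

Under this assumption the counts above reproduce the reported 94.2%.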

Qy 965 QACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFL 1024

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I 
Db 1 QACAMESRHFEETRGMEEEPTHLPLWCVDKLTKVYKNDKKLALNKLSLNLYENQWSFL 60 

Qy 1025 GHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEE 1084 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 GHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEE 120

Qy 1085 HLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRA 1144

I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 HLWFYSRLKSMAQEEIRKETDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRA 180 

Qy 1145 IILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCG 1204

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 IILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCG 240



Qy 1205 SPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVAS 1264

I I I I I I I I II I I I II I I : I II I I 111111111 I I I I I II I I I I I I I I I I I I 
Db 241 SPLFLKGAYKDGYRLTLVKQPAEPGTSQEPGLASSPSGCPRLSSCSEPQVSQFIRKHVAS 300 

Qy 1265 CLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSE 1324

I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 SLLVSDTSTELSYILPSEAVKKGAFERLFQQLEHSLDALHLSSFGLMDTTLEEVFLKVSE 360 



Qy 1325 EDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARG 1384 

I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I II I I I I I I I I I I I I II I I I I I I 
Db 361 EDQSLENSEADVKESRKDVLPGAEGLTAVGGQAGNLARCSELAQSQASLQSASSVGSARG 420 

Qy 1385 DEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLL 1444

: I I I I : I I II I I II II I I I I I I I I I I I I I II :: I I I I I I I I : I I I I : I I I I I I I
Db 421 EEGTGYSDGYGDYRPLFDNLQDPDNVSLQEAEMEALAQVGQGSRKLEGWWLKMRQFHGLL 480

Qy 1445 VKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGN 1504

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I
Db 481 VKRFHCARRNSKALCSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGN 540

Qy 1505 FIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGES 1564

I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 541 FIPYANEERQEYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPMLNLSSGES 600

Qy 1565 RLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDED-LQAWNVSLPPTAGPE 1623

I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | : | | | | | | | | |
Db 601 RLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPVKPDEDSLQAWNMSLPPTAGPE 660

Qy 1624 MWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYL 1683

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 661 TWTSAPSLPRLVHEPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYL 720

Qy 1684 LFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTY 1743

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I II I
Db 721 LFTSDRFRLHRYGAITFGNVQKSIPASFGARVPPMVRKIAVRRVAQVLYNNKGYHSMPTY 780

Qy 1744 LNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAM 1803

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 781 LNSLNNAILRANLPKSKGNPAAYKITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAM 840

Qy 1804 SFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFD 1863

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 841 SFVPASFWFLVAEKSTKAKHLQFVSGCNPVIYWLANYVWDMLNYLVPATCCVIILFVFD 900

Qy 1864 LPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVAT 1923

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 901 LPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVAT 960

Qy 1924 FLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSP 1983

I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 961 FLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSP 1020

Qy 1984 FEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLR 2043

I I I I I I I I I I I I I III II I I I I I I I I I I I I : I I I : I II I I I I I I I I I I I I I I I I I I I
Db 1021 FEWDIVTRGLVAMTVEGFVGFFLTIMCQYNFLRQPQRLPVSTKPVEDDVDVASERQRVLR 1080

Qy 2044 GDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGV-RPGECFGLLGVNGAGKTSTFKMLT 2102

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I
Db 1081 GDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVCVPGECFGLLGVNGAGKTSTFKMLT 1140

Qy 2103 GDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKD 2162

II I I I I I I I I I I I II II I I : I I I I I II II I I I I I I I I I I I I I I I I I I I I I I III
Db 1141 GDESTTGGEAFVNGHSVLKDLLQVQQSLGYCPQFDVPVDELTAREHLQLYTRLRCIPWKD 1200

Qy 2163 EARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKAR 2222

I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1201 EAQWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKAR 1260

Qy 2223 RFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYM 2282

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I
Db 1261 RFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLHCLGSIQHLKNRFGDGYM 1320



Qy 2283 ITVRTKSSQSVKDVVRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVL 2342

I I I I I I I I I : I I I I I I I I I I I I I I I : : I I I I I I I I I I I I I I I I I I I I I I I III 

Db 1321 ITVRTKSSQNVKDWRFFNRNFPEAHAQGKTPYKVQYQLKSEHISLAQVFSKMEQWGVL 1380

Qy 2343 GIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRA 2402 

I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I : I I I I I I I I I I I I I I I I I I I I 
Db 1381 GIEDYSVSQTTLDNVFVNFAKKQSDNVEQQEAE-PSSLPSPLG-LLSLLRPRPAPTELRA 1438

Qy 2403 LVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1439 LVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 1472 

RESULT 9 

US-10-336-219-119 

Sequence 119, Application US/10336219 
Publication No. US20030170853A1 
GENERAL INFORMATION: 
APPLICANT: Allikmets, Rando 
APPLICANT: Anderson, Kent L.
APPLICANT: Dean, Michael
APPLICANT: Leppert, Mark 
APPLICANT: Lewis, Richard A. 
APPLICANT: Li, Yixin 
APPLICANT: Lupski, James R. 
APPLICANT: Nathans, Jeremy 
APPLICANT: Rattner, Amir 

APPLICANT: Shroyer, Noah F.
APPLICANT: Singh, Nanda 
APPLICANT: Smallwood, Philip 
APPLICANT: Sun, Hui 

TITLE OF INVENTION: Methods Of Gene Therapy Using Nucleic Acid Sequences For 
TITLE OF INVENTION: ATP-Binding Cassette Transporter 
FILE REFERENCE: BYLR0072 

CURRENT APPLICATION NUMBER: US/10/336,219
CURRENT FILING DATE: 2003-01-03 
PRIOR APPLICATION NUMBER: 60/039,388 
PRIOR FILING DATE: 1997-02-27 
PRIOR APPLICATION NUMBER: 09/032,438 
PRIOR FILING DATE: 1998-02-27 
NUMBER OF SEQ ID NOS: 120 
SOFTWARE: PatentIn version 3.2
SEQ ID NO 119 
LENGTH: 1472 
TYPE: PRT 
ORGANISM: Mouse 
US-10-336-219-119 

Query Match 56.2%; Score 7117; DB 14; Length 1472; 

Best Local Similarity 94.2%; Pred. No. 0; 

Matches 1388; Conservative 22; Mismatches 60; Indels 4; Gaps 4;

Qy 965 QACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQWSFL 1024 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I 
Db 1 QACAMESRHFEETRGMEEEPTHLPLWCVDKLTKVYKNDKKLALNKLSLNLYENQWSFL 60 



Qy 1025 GHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEE 1084

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 61 GHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEE 120

Qy 1085 HLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRA 1144

I II I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 121 HLWFYSRLKSMAQEEIRKETDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRA 180

Qy 1145 IILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCG 1204

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 181 IILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCG 240

Qy 1205 SPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVAS 1264

I I I I I I I I I I I I I I I I I : I I I 1 I I I I I I I II I I I I I I I I I I I I I I I I I II I
Db 241 SPLFLKGAYKDGYRLTLVKQPAEPGTSQEPGLASSPSGCPRLSSCSEPQVSQFIRKHVAS 300

Qy 1265 CLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSE 1324

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II II I I I I I I I I I I
Db 301 SLLVSDTSTELSYILPSEAVKKGAFERLFQQLEHSLDALHLSSFGLMDTTLEEVFLKVSE 360

Qy 1325 EDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARG 1384

I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I II I I II I I I I I I I I I I I
Db 361 EDQSLENSEADVKESRKDVLPGAEGLTAVGGQAGNLARCSELAQSQASLQSASSVGSARG 420

Qy 1385 DEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLL 1444

: I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I :: I I I I I I I I : I I I I : I I I I I I I
Db 421 EEGTGYSDGYGDYRPLFDNLQDPDNVSLQEAEMEALAQVGQGSRKLEGWWLKMRQFHGLL 480

Qy 1445 VKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGN 1504

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 481 VKRFHCARRNSKALCSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGN 540

Qy 1505 FIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGES 1564

I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I
Db 541 FIPYANEERQEYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPMLNLSSGES 600

Qy 1565 RLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDED-LQAWNVSLPPTAGPE 1623

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I
Db 601 RLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPVKPDEDSLQAWNMSLPPTAGPE 660

Qy 1624 MWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYL 1683

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | || | | | | | | | | | | | |
Db 661 TWTSAPSLPRLVHEPVRCTCSAQGTGFSCPSSVGGHPPQMRVVTGDILTDITGHNVSEYL 720

Qy 1684 LFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTY 1743

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I III I I I I I I I I I I II
Db 721 LFTSDRFRLHRYGAITFGNVQKSIPASFGARVPPMVRKIAVRRVAQVLYNNKGYHSMPTY 780

Qy 1744 LNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAM 1803

11111111111111111111111 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 781 LNSLNNAILRANLPKSKGNPAAYKITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAM 840

Qy 1804 SFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFD 1863

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 841 SFVPASFVVFLVAEKSTKAKHLQFVSGCNPVIYWLANYVWDMLNYLVPATCCVIILFVFD 900



Qy 1864 LPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVAT 1923

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 901 LPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVAT 960

Qy 1924 FLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSP 1983 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 961 FLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSP 1020

Qy 1984 FEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLR 2043 

I I I I I I I I I I I I I III III I I I I I I I I I I I : I I I : I I I I I I I I I I I I I I I I I I I I I I 
Db 1021 FEWDIVTRGLVAMTVEGFVGFFLTIMCQYNFLRQPQRLPVSTKPVEDDVDVASERQRVLR 1080 

Qy 2044 GDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGV-RPGECFGLLGVNGAGKTSTFKMLT 2102

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1081 GDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVCVPGECFGLLGVNGAGKTSTFKMLT 1140

Qy 2103 GDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKD 2162 

I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I III 
Db 1141 GDESTTGGEAFVNGHSVLKDLLQVQQSLGYCPQFDVPVDELTAREHLQLYTRLRCIPWKD 1200 

Qy 2163 EARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKAR 2222

I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1201 EAQWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKAR 1260 

Qy 2223 RFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYM 2282 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1261 RFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLHCLGSIQHLKNRFGDGYM 1320 

Qy 2283 ITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVL 2342 

I I I I I I I I I : I I I I I I I I I I I I I I I : : I I I II I I I II I I I I I I I I I I I I I III 
Db 1321 ITVRTKSSQNVKDWRFFNRNFPEAHAQGKTPYKVQYQLKSEHISLAQVFSKMEQWGVL 1380 

Qy 2343 GIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRA 2402 

I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I 11:1 I I II I I II I I I I I I I I I I I 
Db 1381 GIEDYSVSQTTLDNVFVNFAKKQSDNVEQQEAE-PSSLPSPLG-LLSLLRPRPAPTELRA 1438

Qy 2403 LVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1439 LVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 1472 
Query Match 33.5%; Score 4240.5; DB 12; Length 2261;

Best Local Similarity 39.8%; Pred. No. 0; 

Matches 1000; Conservative 345; Mismatches 730; Indels 435; Gaps 61; 

Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65

I I : I I I I I I : I : I I I I : I I : I I I : : I I I I I : I I 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I II :| : : I : : I I : I |:: |: 
Db 65 GTLPWVQGIICNANNPCFRYPTPGEAPGWGNFNKSIVARLFSDARRLL LYSQKDT 120 

Qy 116 SLGSELEALR--QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173

I: : It | :: |: :| I I II Mill 

Db 121 SMKDMRKVLRTLQQIKKSSSNLKLQDFLVDNETFSG FLYHNLSLP 165 



Qy 174 NSTAQALIxAARVDPPEWHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233 

II : I I I : I : I I III : | | : : 
Db 166 KSTVDKMLRADV 1 LHKVFLQGYQLHLTS - LCNGS KSEEMI 204 

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II I | : : : | : | : | | : : | : 

Db 205 QL GDQEVSELCGLPREKLAAAE RVLRSNMDI 235 

Qy 294 AK- VSQQLGLDAPNGS DS S PQAP P PRRLQALLGDLLD AQKVLQDVDVLS 341 

I : : I : I I : I : I III I : I : : I I 

Db 236 LKPILRTLNSTSPFPSKEIAEA--TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293 

Qy 342 ALALLLPQGACTGRTPGPPASGAGGAAN GT GAGAVMG PN ATAEE GAP S AAALAT P 396 

: : I : III : I : I I I I : II 

Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353 



QY 397 — DTLQGQCSAFVQ— LWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLV 451 

I :: I: : :| 1=1-1 I 
Db 354 YCNDLMKNLES S PLSRI I WKALKPLLVG 381 

QY 452 HLMT SN PKI LYAPAGS EVDRVI LKANET FAFVGNVTHYAQVWLN I SAEI RS FLEQGRLQQ 511 

I I I I I : | : : | : | | : : I : I : I : I : I : 

Db 382 KILYTPDTPATRQVTVIAEWKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434 

QY 512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547 

Ml I I I I I I I : I I : I : I 

Db 435 LVT^LLDSRDNDHFWEQQLDGLDWTAQDIVAFL7VKHPEDVQSSNGSVYTWREAFNETN — 492 



Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607

I : I I : I I I : : : : I : : I : : I : I 

Db 493 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME--LLDERKFWAG 535 

QY 608 VI FQTRKDGS — LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG GRFYFLYGFVW 662 

: : I M I I I I I I I I : I : I I : I : Mill I I I : 

Db 536 IVFTGITPGSIELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPRADPFEDMRYVWGGFAY 595 

Qy 663 IQD^ERAIIDTFVGHDWEPGSWQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAM 722 

:||::|:IM I : : I |:| Mill III I : Ml I : : I : I I I I : 
Db 596 LQDVVEQAIIRVLTGTE-KKTGVYMQQMPYPCWDDIFLRVMSRSMPLFMTLAWIYSVAV 654 

Qy 723 TI QHI VAEKEHRLKEVMKTMGLNNAVHWVAWFI TGFVQLS I SVT ALTAI LKYGQVLMHSH 782 

M I I I II I I I I I : I II M : : I : M I : : I : I I M I I : I : I 
Db 655 1 1 KGI VYEKEARLKETMRIMGLDNS ILWFSWFI S SLI PLLVSAGLLWI LKLGNLLPYSD 714 

Qy 783 WIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDK 842 

::::||:|:M II: I I I : I |:|:| I I : I I I I I I I I :||: M 
Db • 715 PSWFVFLSVFAWTI LQCFLI STLFSRANLAAACGGI I YFTLYLPYVLC VAWQD 769 

Qy 843 ITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAWMLM 901 

I I I M : I Ml I :||||:| I : I : M Mill I Ml MM:: 
Db 770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMML 829 

Qy 902 VDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTE^WEWSWPWARTPRLSVM 961 

I M I : M M II M II I I M M I I I I II M I hill: Ml 
Db 830 FDTFLYGVMTWYIEAVFPGQYGIPRPWYFPCTKSYWFGE ESDEKSHPGSNQKRIS— 8 84 

Qy 962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021 

: I M M II I I I I : I M I : I I M :: I : I I M M 

Db 885 EIC MEEEPTHLKLGVS I QN LVKVYRDGMKVAVDGLALNF YE GQ I T 929 

Qy 1022 SFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLT 1081 

I I I I I I I I M I I I I I I I I M I I M II : I I I I II : I I II M M M II II I II I II 
Db 930 SFLGHNGAGKTTTMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLGVCPQHNVLFDMLT 989 

Qy 1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 114 0 

I I I I : I I I ' I M : : : : I I : M h I I : I I 1 II II M I II II M I II 

Db 990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049 

Qy 1141 GS RAI I LDEPTAGVDP YARRAIWDLI LKYKPGRTI LLSTHHMDEADLLGDRIAI I SHGKL 1200 

N: : I I I I I I I I I I I I : I I II MM II: I I I I : I I I I I I I M I M I I I I I I I I I I I I 
Db 1050 GSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKL 1109 

Qy 1201 KCCGS PLFLKGT YGDGYRLTLVKRPAEPG GPQEPGLAS 1238 

I II MM I II I : II I 

Db 1110 CCVGS SLFLKNQLGTGYYLTLVKKDVES SLSSCRNS S STVS YLKKEDSVSQS S SDAGLGS 1169 

Qy 1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298 

I M Mill: III II M M I M II M I I II :: 
Db 1170 DHESDTLTIDVS— AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227 

Qy 1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSETU^VKESRKDVLPGAEGPASGEGHAG 1358 

I I M I : I : M II I I : II II M I I I I : M 
Db 1228 RLSDLGISSYGISETTLEEIFLKVAEE SGVDA-ETSDGTLP 1267 



Qy 



1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD--NVSLQEV 1415 



Db 

Qy 
Db 

Qy 

Db 

Qy 
Db 
Qy 
Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 
Db 

Qy 

Db 

Qy 



1268 



1416 



1304 



I I : I : I || | : | | : : : : 

-ARRNRRA-FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303 



EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 14 74 
I : I I : I : I I : : I I : I I ll II III: I I : I I : I I I I I I : I : 
ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363 



1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533 

: I I I I I I I I I : I I : : : I I : | : : 

1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE DTGTLELLNAL 1408 

1534 RLPSGVGATCVLKS PANGS LGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

111= :| I: 
1409 TKDPGFGTRCM EGNPI 1424 

1594 P S PAP S D S PAS P DE DLQAWNVS L P PT AG P EMWT SAP S L P RLVRE PVR C 1641 

II I I I I I I : I I : I : : : : | 

1425 PD -TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 14 63 

1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF 1690 

M: M II III: I I I I I : I I I : I : I I : I : 

14 64 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523 

1691 RLHRYGAITFG — NVLKSIPASFGTRAPPMVRK 1721 

I I I : I I I : I : : I 

1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAIKQMKKHLKLAKDSSADRFLNSLGRFMTG 1583 

1722 I AVRRAAQVFYNNKGYHSMPT YLN S LNNAI LRANL PKS KGN PAAYGI TVTNHPMNKT S AS 17 81 

: I = I :: I I I I : I : : : : I I : I I I I I I I | | | : | | : | | | | I j I : I | 
1584 LDTRNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642 

1782 LS-LDYLLQGTDVVIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840 
I I : : I I ::: I : I I I I I I I I I I I I I I : I :: I I I I I I I : I I I : I I I I : I 

1643 LS EVALMTT S VDVLVS I CVI FAMS FVPAS FWFLI QERVS KAKHLQFI S GVKPVI YWLSN 1702 

1841 YVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEV 1900 

Mill 11:1111 :M I :| I II I : | | I I I : I I I I I I I : : 

1703 FVWDMCNYWPATLVI 1 1 FICFQQKS YVSSTNLPVIALLLLLYGWSITPLMYPAS FVFKI 1762 

1901 PSSAWFLIVINLFIGITATVATFLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLME 1960 

I I : I I I I : I I I I I I : I I I I : I : I I I I : | I I I I I I I I :: I I I I : : 
1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 

1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGVVGFLLTIMCQYNFLRRPQR 2020 

I I * s : : I : : : II I I : I I I I I I I I I I I I I : | : : | | | | | : 

1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880 

2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKI ENLTKVYKSRKI GRI LAVDRLCLGVR 2079 

• hllll Nihil | | ::: | : | | | : | : : | I I I I : I : I : 
1881 VNAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK RKPAVDRICVGIP 1937 

2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

II I II III III II 11:11 III III I : I |:| |: | :|:| : :| |::||||| ||: 
1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 

2140 FDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199 
: I I HI- : Ml: I : M Ml: II I II M II I I I I I I I I I I I I : 



Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057 

Qy 2200 ALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIM 2259

Mil I : I I I I I I I I I I I I I I I I I I I I ::| I I I I I I I I I I I I I I I I I I I I : I I I 
Db 2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSWKEGRSWLTSHSMEECEALCTRMAIM 2117 

Qy 2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318

I I I I I I I I I : I I I I I I I I I I I I I I I : : I | II II : : I I I : I : | 
Db 2118 WGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177 

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

IN I I I I :: I I : I I I I I I I I I I I I I I I I I I I If I I I : 

Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227 
; PRIOR FILING DATE: 2000-03-15

; PRIOR APPLICATION NUMBER: 60/124,702 

; PRIOR FILING DATE: 1999-03-15 

; PRIOR APPLICATION NUMBER: 60/138,048 

; PRIOR FILING DATE: 1999-06-08 

; PRIOR APPLICATION NUMBER: 60/139,600 

; PRIOR FILING DATE: 1999-06-17 

; PRIOR APPLICATION NUMBER: 60/151,977 

; PRIOR FILING DATE: 1999-09-01 

; NUMBER OF SEQ ID NOS: 287

; SOFTWARE: FastSEQ for Windows Version 4.0 

; SEQ ID NO 1
LENGTH: 2261 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-452-510-1 

Query Match 33.5%; Score 4240.5; DB 15; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 0; 

Matches 1000; Conservative 345; Mismatches 730; Indels 435; Gaps 61; 

Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65

M : I I M I I : I : I I | |: I :| ||: :| | | | | : || 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 

QY 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I I I : I : : I : : | | : | | : : | : 

Db 65 GTLPWVQGIICNANNPCFRYPTPGEAPGWGNFNKSIVARLFSDARRLL LYSQKDT 120 



Qy 116 SLGSELEALR--QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173



i . .Ill II Mill 

121 SMKDMRKVLRTLQQI KKS S SNLKLQDFLVDNET FSG FLYHNLSLP 165 

174 NSTAQALI^^VRVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233 

M =111 : I : I I Ml : | |:: 
166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204 

2 34 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II I I : : : I : I : II : : I : 

205 QL GDQEVSELCGLPREKLAAAE RVLRSNMDI 235 

2 94 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD AQKVLQDVDVLS 341 

I : • I : I I : I : I III I : I : : | | 

236 LKPILRTLNSTSPFPSKELAEA — TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293 

342 ALAL L L PQ GAC T GRT P G P PAS GAGGAAN GT GAGAVMG PN AT AE E GAP S AAALAT P 396 

: - : I : Ml ■ : I : I I I I : II 

2 94 S S S STQI YQAVS RI VCGHPEGGGLKI KSLNW YEDNNYKALFGGNGTEEDAET FYDNSTTP 353 

397 DTLQGQCSAFVQ — LWAGLQP I LCGNNRT I E PEALRRGNMS S LGFT S KEQRNLGLLV 451 

Is: I : : : I I : I : I I 
354 YCNDLMKNLESSPLSRIIWKALKPLLVG — 381 

452 H LMT S N P K I L YAP AG S EVD RVI LKAN ET FAFVGNVT H YAQVWLN I S AE I RS FLEQ G RLQQ 511 

I I I I I : | : : | : | | : : | : | : | : | : | s 

382 KILYTPDTPATRQVMAEWKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434 

512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 54 7 

= 11 I I I I I I I : I I : |:| 

435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN-- 492 

548 LPSGMALLQQLDTIDN7UVCGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607 

I : I I : | | | : : : : | : : | : : | : | 

4 93 QAIRTI.S RFMECVNLNKLEPIATEVWLINKSME — LLDERKFWAG 535 

608 VI FQTRKDGS — LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG GRFYFLYGFVW 662 

-I M II II MM : |:||:|: Mill I M : 

536 IVFTGITPGSIELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPR7VDPFEDMRYVWGGFAY 595 

663 IQDmERAIIDTFVGHDWEPGSWQMFPYPCYTRDDFLFVIEHMMPLCWISWVYSVAM 722 

: I I : M : I I I I : : I I : I I I I I I Mil: I I I I : : I : I II I : 

596 LQDVVEQAIIRVLTGTE-KKTGVYMQQMPYPCWDDIFLRVMSRSMPLmTLAWIYSVAV 654 

723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWAWFITGFVQLSISVTALTAILKYGQVLMHSH 782 

I- M Ml MM |: Ml:|:: I MM: : I : I I III I M M 
655 II KGI VYEKEARLKETMRIMGLDNS I LWFSWFI S SLI PLLVSAGLLWI LKLGNLLP YSD 714 

783 WIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDK 842 

: : : : I I : I : I I M : I I I - I I : I : I I I : I I I I I M I MM: II 
715 PSWFVFLSVFAWTILQCFLISTLFSRANLAAACGGIIYFTLYLPYVLC VAWQD 769 

843 ITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLM 901 

I I I I I M III I :||||:| I : I : I I Mill I III :|:|:: 

770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMML 829 

902 VDAVWGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961 
I M I : : M I I I I I II II : I I II I II Mill Mill: Ml 



Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 
Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



830 FDTFLYGVMTWYIEAVFPGQYGIPRPWYFPCTKSYWFGE— -ESDEKSHPGSNQKRIS— 884 

962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021 

• I I I I I I I I I I I : I I I I : I | : | :: | : | | | | | : 

8 85 EIC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929 

1022 S FLGHNGAGKTTTMS I LTGLFP PT SGSAT I YGHDI RTEMDEI RKNLGMCPQHNVLFDRLT 1081 

I I I I I I I I I I I I I I I I I I I I : I I I |||:|| I I : I I I : I | | I | | | | | I I 

930 SFLGHNGAGKTTTMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLGVCPQHNVLFDMLT 989 

1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140 
I I I I : I I I : I I I : : : : : : I I : : I I : I I : I I I I I I I : I I I I I I : I I I I 

990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049

1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

11= = I I I I I I I I I I I I : I I I I : I : I I I : II I I M I M I I I I II M I I I I I I I I II I I 
1050 GSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKL 1109

1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPG GPQEPGLAS 1238

I I I I I I I I I I I I I I I : I : M | 

1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298 

I M Mill: III I I : I : I I Ml I : I I I J I :: 
1170 DHESDTLTIDVS--AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227

1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358

I I : I I : I : : I I I I I : I I I I : I I I I I : | | 
1228 RLSDLGISSYGISETTLEEIFLKVAEE SGVDA-ETSDGTLP 1267 

1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD--NVSLQEV 1415

I I : I : I II I : | | : : : : 

1268 ARRNRRA-FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303

1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474

I : I I : I : I I : : I I : I I I I I I I I I : I I : I I : I | | II I : I : 
1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363

1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533 

Ml I I : I I : : : I I : | : : 

1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE DTGTLELLNAL 1408 

1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

I I I: • I I: 
1409 TKDPGFGTRCM- EGNPI 1424 



1594 PSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVR C 1641

M I I | | | | : | | : | : : : : | 

1425 PD TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 1463



1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF-- 1690 

M: M M Ml: I II I I M I I : I : I I : I : 

1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523 



1691 RLHRYGAITFG--NVLKSIPASFGTRAPPMVRK 1721

III : I I I: I ::| 

1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAIKQMKKHLKLAKDSSADRFLNSLGRFMTG 1583



Qy 1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781

• I : I : : I I I I : I : : : : I I : I I I I I I I I I I : I I : I I I I I I I : I I 
Db 1584 LDTRNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642 

Qy 1782 LS-LDYLLQGTDWIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840 

M : : I I : : : I : I I II I I II II II I I : I : : I I I I I I I : I I I : I I I I : I 
Db 1643 LSEVALMTTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKAKHLQFISGVKPVIYWLSN 1702

Qy 1841 YVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEV 1900

•111111:1111 : I I I : I I I I I : I I I I I I I I I I : I I I I I ! I : : 
Db 1703 FVWDMCNYWPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKI 1762

Qy 1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLME 1960 

MUNI : I I I I I I : I I I I : I : I I I I : I I I I I I I II :: I I I I : : 
Db 1763 PSTAYWLTSWLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 



Qy 1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQR 2020 

I I : : : : I : : : II I I : I I I I I I I I I I I I I : I : : | | | I I : 
Db 1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880

Qy 2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079 

' I : I I I I Nihil I I : : : I : I II : I : : I I I I I : I : I : 
Db 1881 VNAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK RKPAVDRICVGIP 1937

Qy 2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

I I I I I I I I I I I I I I I :| I I I I I I I I : I |:||:| :|:| : :| |::||||| ||: 
Db 1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 

Qy 2140 FDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199

slllll::: Mh I : : I : I I : I I I I M I I I I I I I I I II I I I I : 
Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057

Qy 2200 ALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIM 2259 

I I I I I : I I I I I I I I I I I I I I I I I I I I ::| II I I I I I I I I I I II I I I II I : I II 
Db 2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSWKEGRSWLTSHSMEECEALCTRMAIM 2117 

Qy 2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I I II I I I I : I I I I I I I I I I I I I I I : : I I II | | : : | | | : | : | 
Db 2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177 

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

Ml I I I I : : I I : I I I I I I I I I I I I I I I I I I I I I III: 

Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227 



RESULT 12 
US-10-745-377-5 

; Sequence 5, Application US/10745377 

; Publication No. US20040137423A1 

; GENERAL INFORMATION: 

; APPLICANT: Hayden, Michael R. 

; APPLICANT: Pimstone, Simon 

; APPLICANT: Brooks-Wilson, Angela R. 

; APPLICANT: Clee, Susanne M. 

; TITLE OF INVENTION: Compositions and Methods for Modulating 
; TITLE OF INVENTION: HDL Cholesterol and Triglyceride Levels 



; FILE REFERENCE: 760050-109 

; CURRENT APPLICATION NUMBER: US/10/745,377 

; CURRENT FILING DATE: 2003-12-23 

; PRIOR APPLICATION NUMBER: 09/654,323 

; PRIOR FILING DATE: 2000-09-01 

; PRIOR APPLICATION NUMBER: US 60/124,702 

; PRIOR FILING DATE: 1999-03-15 

; PRIOR APPLICATION NUMBER: US 60/138,048 

; PRIOR FILING DATE: 1999-06-08 

; PRIOR APPLICATION NUMBER: US 60/139,600 
; PRIOR FILING DATE: 1999-06-17 
; PRIOR APPLICATION NUMBER: US 60/151,977 
; PRIOR FILING DATE: 1999-09-01 
; PRIOR APPLICATION NUMBER: US 09/526,193 
; PRIOR FILING DATE: 2000-03-15 
; PRIOR APPLICATION NUMBER: US 60/213,958 
; PRIOR FILING DATE: 2000-06-23 
; NUMBER OF SEQ ID NOS : 256 

; SOFTWARE: Word for Windows Version 6.0 (ASCII Text) 
; SEQ ID NO 5 

; LENGTH: 2261 
; TYPE: PRT 

; ORGANISM: homo sapien 
US-10-745-377-5 

Query Match 33.5%; Score 4240.5; DB 16; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 0; 

Matches 1000; Conservative 345; Mismatches 730; Indels 435; Gaps 61; 

Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65

I I : I I I I I I : I : I I I I : I I : I I I : : I I I I | : | | 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I I I : I : : I : : I I : I I : : I : 

Db 65 GTLPWVQGIICNANNPCFRYPTPGEAPGWGNFNKSIVARLFSDARRLL----LYSQKDT 120

Qy 116 SLGSELEALR--QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173

Is : II I :: I: :| I I II Mill 

Db 121 SMKDMRKVLRTLQQIKKSSSNLKLQDFLVDNETFSG FLYHNLSLP 165

Qy 174 NSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233 

II : I I I : I : I I III : I I : : 
Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II | | : : : | : | : I I : : I : 

Db 205 ---QL GDQEVSELCGLPREKLAAAE RVLRSNMDI 235

Qy 294 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD AQKVLQDVDVLS 341

I : : I Ml : I : I III | : | : :| | 

Db 236 LKPILRTLNSTSPFPSKELAEA--TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293

Qy 342 ALALLLPQGACTGRTPGPPASGAGGAAN GTGAGAVMGPNATAEEGAPSAAALATP 396 

: : I : I I I : hill h II 

Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353 



397 DTLQGQCSAFVQ--LWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLV 451

I : : I : : : I I : I : I I 
354 YCNDLMKNLESSPLSRIIWKALKPLLVG 381 

452 HLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVWLNISAEIRSFLEQGRLQQ 511

Mill :|: : |:|| : :| :| :| :|:| : 

382 KILYTPDTPATRQVMAEVNKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434

512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547

•II I I I I I I I : I I : I : I 

435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN-- 492 

548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607 

I : I I : I I I : : : : | : : | : : | : | 

493 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME -- LLDERKFWAG 535

608 VIFQTRKDGS--LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG GRFYFLYGFVW 662

: : I I I I I I I I I I I : I : I I : I : I I I I I | | | : 

536 IVFTGITPGSIELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPRADPFEDMRYVWGGFAY 595

663 IQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAM 722

: I I : : I : I I I I : : I I : I Mill I I I I : III I : : I : | I | I : 

596 LQDVVEQAIIRVLTGTE-KKTGVYMQQMPYPCYVDDIFLRVMSRSMPLFMTLAWIYSVAV 654

723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSH 782

I : I I I M M M I : I M : I : : I : M I : : I : I I II I I : I : | 
655 IIKGIVYEKEARLKETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWILKLGNLLPYSD 714

783 WTIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDK 842

: : : : M : I : I I II: II I M I : I : I I I : I I I I II I I I : II : II 
715 PSVVFVFLSVFAVVTILQCFLISTLFSRANLAAACGGIIYFTLYLPYVLC VAWQD 769

843 ITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLM 901 

I I M I M llll:||||:| |:|:|l Mill I III MM:: 
770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMML 829 

902 VDAVWGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961 

I M I : M I I I I II I I I I M I II I I I, I I I I I |:|||: | : | 

830 FDTFLYGVMTWYIEAVFPGQYGIPRPWYFPCTKSYWFGE ESDEKSHPGSNQKRIS— 884 

962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021 

' I MINIM I I : I IMM |:|:: IMI II |: 

885 EIC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929 

1022 SFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLT 1081 
M I II I II I II I II I I I I II I II I M M I I I I I M I I I M I I M II I III I I II 
930 SFLGHNGAGKTTTMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLGVCPQHNVLFDMLT 989 

1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140
I M I M I I : I I I :::: :: IMM |: I IM I I I I I I M I II I I M I I I 

990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049

1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

I I: M I I II II II I II M I II M M II : I I I I M I II I I I I I I M II I I I I I I II II 
1050 GSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKL 1109



1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPG GPQEPGLAS 1238



I I I I I I I 11111111:1 : I I I 

Db 1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

Qy 1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298 

I : I Mill: Ml II : I : I I M I I : M I II : : 

Db 1170 DHESDTLTIDVS--AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227

Qy 1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358

I I : I I : I : : I II I I : I I II : II I.I I : II 
Db 1228 RLSDLGISSYGISETTLEEIFLKVAEE SGVDA-ETSDGTLP 1267 

Qy 1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD--NVSLQEV 1415

II : I : I II |: ||: :: : 

Db 1268 ARRNRRA-FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303 

Qy 1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474

I : I I : I : I I : : I | : | | I I | I I I I : | I : I I : I I I I I I : I : 

Db 1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363

Qy 1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533 

MM I I I I I I : II : : : I - I : I : : 
Db 1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE DTGTLELLNAL 1408

Qy 1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

III: : I I: 
Db 1409 TKDPGFGTRCM EGNPI 1424

Qy 1594 PSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVR C 1641

II I M I I I : I I M : : : : I 

Db 1425 PD TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 1463

Qy 1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF 1690 

M: II II Ml: I I I I I : II I : I : II : I : 

Db 1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523 

Qy 1691 RLHRYGAITFG--NVLKSIPASFGTRAPPMVRK 1721

I I I : I I I : I : : I 

Db 1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAIKQMKKHLKLAKDSSADRFLNSLGRFMTG 1583

Qy 1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781

: I : I : : I I I I : I : : : M I : I I I I I I I I I I : M : II II I I I : I I 
Db 1584 LDTRNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642

Qy 1782 LS-LDYLLQGTDWIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840 

II : : I I : : M : I I I I I I I M I I I I I : I : : I I I I M I : I I I : I I I I M 

Db 1643 LSEVALMTTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKAKHLQFISGVKPVIYWLSN 1702

Qy 1841 YVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEV 1900

Mill I I : I I I I Ml I MIMI: I M I I M I I I : I I I I I I |:: 
Db 1703 FVWDMCNYWPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKI 1762

Qy 1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLME 1960 

I I M I I I MIIMI : I I I I : I : I I I I M I I I II M I : : I I I I : : 
Db 1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 

Qy 1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGVVGFLLTIMCQYNFLRRPQR 2020 

I I : : : : I : :: I I . I I : I I | M I I I I I I I I M : : I I I I I : 



Db 1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880



Qy 2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079

: I : I I I I I I I I : I I I | : : : I : I I I : I : : I I I I I : I : I : 
Db 1881 VNAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK RKPAVDRICVGIP 1937 

Qy 2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

I I I I I I I I I I I I I I I : I I I I I I I I I : I I : I I : I : I : I : : I I : : I I I I I I I : 
Db 1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 

Qy 2140 FDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199

: I I I I I : : : III: I : : I : I I : I I I I I : I I I I I I I I I I I I I I I : 
Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057 

Qy 2200 ALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIM 2259 

MM I : I I II I I I II II II I I I I II I ::| I I I I I I I I I I I I I I I I j I I I : I I I 
Db 2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSVVKEGRSWLTSHSMEECEALCTRMAIM 2117

Qy 2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I I I I I I I I : I I II I I I I II I I I I I : Ml II I I : : I I I M : I 
Db 2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177 

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

I I I I I I I : M I : I I I I I I I II I I II I I I I I I II Ml: ■ 
Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227 

RESULT 13 
US-10-744-465-1 

Sequence 1, Application US/10744465 
Publication No. US20040157250A1 
GENERAL INFORMATION: 
APPLICANT: Hayden, Michael R. 
APPLICANT: Brooks-Wilson, Angela R. 
APPLICANT: Pimstone, Simon N. 

TITLE OF INVENTION: METHODS AND REAGENTS FOR MODULATING CHOLESTEROL LEVELS 
FILE REFERENCE: 760050-92 

CURRENT APPLICATION NUMBER: US/10/744,465 
CURRENT FILING DATE: 2003-12-23 
PRIOR APPLICATION NUMBER: 10/617,334 
PRIOR FILING DATE: 2003-07-10 
PRIOR APPLICATION NUMBER: US 09/526,193 
PRIOR FILING DATE: 2000-03-15 
PRIOR APPLICATION NUMBER: 60/124,702 
PRIOR FILING DATE: 1999-03-15 
PRIOR APPLICATION NUMBER: 60/138,048 
PRIOR FILING DATE: 1999-06-08 
PRIOR APPLICATION NUMBER: 60/139,600 
PRIOR FILING DATE: 1999-06-17 
PRIOR APPLICATION NUMBER: 60/151,977 
PRIOR FILING DATE: 1999-09-01 
NUMBER OF SEQ ID NOS : 287 

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 1 
LENGTH: 2261 
TYPE: PRT 

ORGANISM: Homo sapiens 



US-10-744-465-1 



Query Match 33.5%; Score 4240.5; DB 16; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 0; 

Matches 1000; Conservative 345; Mismatches 730; Indels 435; Gaps 61; 

Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65 

I I : I I I I I I : I : I I I I : I I : I I I : : I I I I I : I I 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I I I : I : : I : : I I : I - I : : I : 

Db 65 GTLPWVQGIICNANNPCFRYPTPGEAPGWGNFNKSIVARLFSDARRLL LYSQKDT 120 

Qy 116 SLGSELEALR--QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173

I : : | | | :: | : : I I I I I I I I I I 

Db 121 SMKDMRKVLRTLQQIKKSSSNLKLQDFLVDNETFSG FLYHNLSLP 165 

Qy 174 NSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233 

II :| I I : I :| I III : I |:: 
Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293

II I I : : : | : | : I I : : I : 

Db 205 QL GDQEVSELCGLPREKLAAAE - RVLRSNMDI 235 

Qy 294 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD AQKVLQDVDVLS 341 

I : : I : I I : I : I III I : I : : I I 

Db 236 LKPILRTLNSTSPFPSKELAEA--TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293

Qy 342 ALALLLPQGACTGRTPGPPASGAGGAAN GTGAGAVMGPNATAEEGAPSAAALATP 396

: : I : III : I : I I I I : II 

Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353 

Qy 397 DTLQGQCSAFVQ--LWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLV 451

I : : I : : : I I : I : I I 
Db 354 YCNDLMKNLESSPLSRIIWKALKPLLVG 381 

Qy 452 HLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVWLNISAEIRSFLEQGRLQQ 511

I I I I I : | : : | : | | : : | : I : I : I : I : 

Db 382 KILYTPDTPATRQVMAEVNKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434

Qy 512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547

: I I I I I I I I I : I I : I : I 

Db 435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN-- 492

Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607

I : II :|| |::.: : I ::| :: I :| 

Db 493 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME -- LLDERKFWAG 535

Qy 608 VIFQTRKDGS--LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG GRFYFLYGFVW 662

: : I II II II I I I I : I : I I : I : I I I I I I I I : 

Db 536 IVFTGITPGSIELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPRADPFEDMRYVWGGFAY 595 

Qy 663 IQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAM 722

: I I : : I : I I I I : : I I : I I I I I I Mil: I I I I : : I : I I I I : 

Db 596 LQDVVEQAIIRVLTGTE-KKTGVYMQQMPYPCYVDDIFLRVMSRSMPLFMTLAWIYSVAV 654



723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSH 782

I : I I I I I I I I I I : I I I : I : : I : I I I : : I : I I M I I : I : I 
655 IIKGIVYEKEARLKETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWILKLGNLLPYSD 714 

783 WIIWLFLAWAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDK 842 

: : : : I I : I : I I I I : I I I : I I : I : I I I :'l I I I I I I I I : I I : II 
715 PSWFVFLSVFAVWILQCFLISTLFSRANLAAACGGIIYFTLYLPYVLC VAWQD 769 

843 ITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLM 901 

I I I I I : I I I I I : I I I I : I I : I : I I : I I I I I I I I : I : I : : 
770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMML 829 

902 VDAVWGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961 

I : I I :: I I I I I I I I I I I : I I I I I I I I I I I I I : I I I : I : I 

830 FDTFLYGVMTWYIEAVFPGQYGIPRPWYFPCTKSYWFGE ESDEKSHPGSNQKRIS-- 884 

962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021

: I I I I I I I II I I :■ I I I I : I I : I : : I : I I I I I : 

885 EIC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929 

1022 SFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLT 1081 
! I I I I I I I I I I I I I I I I I I I I I I I I I : I I I | | | : | | I I : I I I : I I II I I I I I II 
930 SFLGHNGAGKTTTMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLGVCPQHNVLFDMLT 989

1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140 
I I I I : I I I : I I I :::::: I I :: I I : I I : I I I I I I I : I I I I I I : I I I I 

990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049

1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200 

II: : I I II I I I I I I I I : I I I I : I : I I I : I I I I : I I I I I I I I I I : I I I I I I I I I I I j I 
1050 GSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKL 1109 

1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPG GPQEPGLAS 1238 

I I I I I I I I I I I I I I I : I : I I I 

1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298

I : I Mill: III I I : I : I M I I I : I I I II : : 
1170 DHESDTLTIDVS--AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227

1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358

I I : I I : I : : I I I I I : I I I I : I I I I I : II 
1228 RLSDLGISSYGISETTLEEIFLKVAEE SGVDA-ETSDGTLP 1267 

1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD--NVSLQEV 1415

I I : I : I II I : I I : : : 

1268 ARRNRRA-FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303 

1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474

I : I I : I : I I : : I I : I I I I I I I I I : I I : I I : I I I I I I : I : 
1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363 

1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533

: I I I I I I I I I : I I : : : I I : I : : 

1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE DTGTLELLNAL 1408






1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

III: : I I : 

1409 TKDPGFGTRCM- EGNPI 1424



1594 PSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVR C 1641

II I I I I I I : I I : I : : : : I 

1425 PD TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 1463

1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF 1690 

II: II II III: I I I I I : I I I : I : I I : I : 

1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523

1691 RLHRYGAITFG--NVLKSIPASFGTRAPPMVRK 1721

I I I : I I I : I : : I 

1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAIKQMKKHLKLAKDSSADRFLNSLGRFMTG 1583

1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781 

: I : I :: I I I I : I : : : : I I : I I I I I I I I I I : I I : I I I I I I I : I I 
1584 LDTRNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642

1782 LS-LDYLLQGTDWIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840 
II: : I I ::: I : I I I I I I I I I I I I I I : I :: I I I I I I I : I I I : I I I I : I 

1643 LSEVALMTTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKAKHLQFISGVKPVIYWLSN 1702

1841 YVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEV 1900 

: I I I I I I : I I I I : I I I : I I I I I : I I I I I I I I I I : I I I I I I I : : 
1703 FVWDMCNYWPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKI 1762

1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLME 1960 

I I : I I I I : 1 I I I I I : I I I I : I : I I I : I I I I I I I I I :: I I I I : : 
1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 

1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGVVGFLLTIMCQYNFLRRPQR 2020

I I : : : : | : : : II I I : I I I I I I I I I I I I I : I : : I I I I I : 
1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880

2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079 

: I: I I I 1111:11 I I : : : I : I I I : I : : I I I I I : I : I : 

1881 VNAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK---RKPAVDRICVGIP 1937

2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139

I I I I I I I I I I I I I I I : I I I I I I I I I : I I : I I : I : I : I : : I I :: I I I I I I I : 
1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997

2140 FDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199

: I I I I I : : : III: I : : I : I I : I I I I I : I I I I I I I I I I I I I I I : 
1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057 

2200 ALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIM 2259 

I I II I : I I I I I I I I I I I I I I I I I II I :: I I I I I I I I I I I I I I I I I I I I I : I I I 
2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSWKEGRSWLTSHSMEECEALCTRMAIM 2117 

2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I I I I I I I I : I I I I I I I I I I I I I I I : : I I II I I : : I I I : I : I 
2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177

2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368



Ill I I I I :: I I : I I I I I I I I I I I I I I I I I I I I I III: 

Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227 



RESULT 14 
US-10-313-641-9 

Sequence 9, Application US/10313641 
Publication No. US20030162758A1 
GENERAL INFORMATION: 
APPLICANT: Ishida, Brian 
APPLICANT: Duncan, Keith 
APPLICANT: Bailey, Kathy 
APPLICANT: Kane, John 
APPLICANT: Schwartz, Daniel 

TITLE OF INVENTION: Treatments for Age Related-Macular Degeneration (AMD) 
FILE REFERENCE: P02351US2 

CURRENT APPLICATION NUMBER: US/10/313,641 
CURRENT FILING DATE: 2002-12-06 
PRIOR APPLICATION NUMBER: US 60/415,864 
PRIOR FILING DATE: 2002-10-03 
PRIOR APPLICATION NUMBER: US 60/340,498 
PRIOR FILING DATE: 2001-12-07 
NUMBER OF SEQ ID NOS : 12 
SOFTWARE: PatentIn version 3.1 
SEQ ID NO 9 
LENGTH: 2261 
TYPE: PRT 
ORGANISM: Human 
US-10-313-641-9 

Query Match 33.5%; Score 4237.5; DB 14; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 0; 

Matches 999; Conservative 346; Mismatches 730; Indels 435; Gaps 61; 

Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65 

I I : I I I I I I : I : I I I I : I I : I I I : : I I I I I : I I 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I I I : I : : I : : I I : I I : : I : 

Db 65 GTLPWQGIICNANNPCFRYPTPGEAPGWGNFNKSIVARLFSDARRLL LYSQKDT 120 

Qy 116 SLGSELEALR--QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173

I : : M I : : I : : I I I I I I I I I I 

Db 121 SMKDMRKVLRTLQQIKKSSSNLKLQDFLVDNETFSG FLYHNLSLP 165

Qy 174 NSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233

II : I I I : I : I I III : I I : : 
Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204 

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II | | : : : | : | : I I : : I : 

Db 205 QL GDQEVSELCGLPREKLAAAE RVLRSNMDI 235

Qy 294 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD AQKVLQDVDVLS 341 

I : : I : I I : I : I III I : I : : I I 

Db 236 LKPILRTLNSTSPFPSKELAEA--TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293 



Qy 342 ALALLLPQGACTGRTPGPPASGAGGAAN GTGAGAVMGPNATAEEGAPSAAALATP 396

: : I : III : I : I I I I : II 

Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353

Qy 397 DTLQGQCSAFVQ--LWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLV 451

I : : I : : : I I : I : I I 
Db 354 YCNDLMKNLESSPLSRIIWKALKPLLVG 381 

Qy 452 HLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVWLNISAEIRSFLEQGRLQQ 511

I I I I I : | : : | : | | : : | : | : | : | : I : 

Db 382 KILYTPDTPATRQVMAEVNKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434 

Qy 512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547 

: I I I I I I I I I : I I : I : I 

Db 435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN-- 492

Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607 

I : I I : | | | : : : : ' | : : | : : I : I 

Db 493 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME -- LLDERKFWAG 535

Qy 608 VIFQTRKDGS--LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG GRFYFLYGFVW 662

: : I I I I I I I I I I I : I : I I : I : I I I I I I I I : 

Db 536 IVFTGITPGSIELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPRADPFEDMRYVWGGFAY 595

Qy 663 IQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAM 722

: I I : : I : I I I I : : I I : I Mill I II h I I I I : : I : I I I I : 

Db 596 LQDVVEQAIIRVLTGTE-KKTGVYMQQMPYPCYVDDIFLRVMSRSMPLFMTLAWIYSVAV 654

Qy 723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSH 782 

I : I I I I I I I I I I : I I I : I : : I : I I I : : I : I I I I I I : I : I 
Db 655 IIKGIVYEKEARLKETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWILKLGNLLPYSD 714

Qy 783 WIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDK 842

: : : : I I : I : I I I I : I I I : I I : I : I I I : I I I I I I I I I : I I : II 
Db 715 PSWFVFLSVFAWTILQCFLISTLFSRANLAAACGGIIYFTLYLPYVLC VAWQD 769 

Qy 843 ITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLM 901 

I I Ml:! I I I I : I I I I : I I : I : I I : I I I I I I I I : I : I : : 
Db 770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMML 829 

Qy 902 VDAVWGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961

I : I I : : I I I I I I I I I I I : I I I I I I I I I I I I hill: hi 

Db 830 FDTFLYGVMTWYIEAVFPGQYGIPRPWYFPCTKSYWFGE---ESDEKSHPGSNQKRIS-- 884

Qy 962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021 

: I I I I I I I I I I I : I I I I : I I : I : : I : I I I I I : 

Db 885 EIC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929

Qy 1022 SFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLT 1081 

I I I I I I I I I I I I II I I I I I I I I I I I I : I I I 111:11 I I : I I I : I I I I I I I II II 

Db 930 SFLGHNGAGKTTTMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLGVCPQHNVLFDMLT 989 

Qy 1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140 

1111:111:111 : : : : :: I I :: I h i hi I I I I I h I I I I I h I I I I 
Db 990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049



Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200 

II: : I I I I I I I I II I I : I I I I : I i I I I : I I I I : I I I I I I I I I I : II I I I I I I I I I II 

Db 1050 GSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKL 1109 

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPG — GPQEPGLAS 1238 

I I I I I I I 11111111:1 : II I 

Db 1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

Qy 1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298 

I : I I I I I I : III I I : I : I I I I I I : I I I II : : 

Db 1170 DHESDTLTIDVS--AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227

Qy 1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358

I I : I I : I : : M I I I : I M I : I I I I I : II 

Db 1228 RLSDLGISSYGISETTLEEIFLKVAEE SGVDA-ETSDGTLP 1267

Qy 1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD--NVSLQEV 1415

I I : I : I II | : | | : : : : 

Db 1268 ARRNRRA-FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303 

Qy 1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474

I : I I : I : I I : : I I : I I I I I I I I I : I I : I I : I I I I I I : I : 

Db 1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363 

Qy 1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533 

: I I I I I I I I I : I I : : : I I : I : : 

Db 1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE DTGTLELLNAL 1408

Qy 1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

III: :| |: 

Db 1409 TKDPGFGTRCM- EGNPI -- 1424

Qy 1594 PSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVR C 1641

II | | | | | | : | | : | : : : : | 

Db 1425 --PD TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 1463

Qy 1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF 1690

II: II Mill: I I I I I : I I I : I : I I : I : 

Db 1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523 

Qy 1691 RLHRYGAITFG--NVLKSIPASFGTRAPPMVRK 1721

III * I I I : I : : I 

Db 1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAIKQMKKHLKLAKDSSADRFLNSLGRFMTG 1583

Qy 1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781

: : : | : : I | | I : I : : : : I I : I I I I I I I I I I : I I : I I I I I I I : I I 

Db 1584 LDTRNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642

Qy 1782 LS-LDYLLQGTDWIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840

II : : I I : : : I : I I I I I I I I I I I I I I : I : : I I I I I I I : I I I : I I I I : I 

Db 1643 LSEVALMTTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKAKHLQFISGVKPVIYWLSN 1702 

Qy 1841 YVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEV 1900

: I I I I I I : I I I I : I I I • I I II I • I I I I I I I I I I : I I I I I I I : : 

Db 1703 FVWDMCNYWPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKI 1762 

Qy 1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLME 1960 



I I : I I I I : I I I I I I : I I I I : I : I I I I : I I I I I I I I I : : I I I i : : 

Db 1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 

Qy 1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQR 2020 

I I : : : : I : : : II I I : I I I I I I I I I I I I I : h : I I I I I : 

Db 1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880 

Qy 2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079 

: I : I I I I 1111:11 I I : : : I : I I I : I : : I I I I I : I : I : 
Db 1881 VNAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK---RKPAVDRICVGIP 1937

Qy 2080 PGECFGLLGWGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

I I I I I I I I I I I I I I I : I I I I I I I f I : I I : I I : I : I : I : : I I : : I I I I I I I : 
Db 1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 

Qy 2140 FDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199

: I I I I I : : : III: I : : I : I I : I I I I I : I I I I I I I I I I I I I I I :
Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057

Qy 2200 ALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIM 2259 

I I II I : I I I I I I I I I I I I I I I I I I I I :: I I I I I I I I I I I I I I I I I I I I I : I I I 

Db 2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSWKEGRSWLTSHSMEECEALCTRMAIM 2117 

Qy 2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I I I I I I I I : I I I I I I I I I I I I I I I : : I I II I I : : I I I : I : I 
Db 2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177 

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

III I I I I : : I I : I I II I I I I I I I I I I I I I II I I III: 

Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227 
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US-10-313-641-10 
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; FILE REFERENCE: P02351US2 

; CURRENT APPLICATION NUMBER: US/10/313,641

; CURRENT FILING DATE: 2002-12-06
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Query Match 33.5%; Score 4237.5; DB 14; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 0; 

Matches 999; Conservative 346; Mismatches 730; Indels 435; Gaps 61;



Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65

II : I I I I I I : I : I I I I : I I : I I I : : I I I I I : I I 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I I I : I : : I : : I I : I I : : I : 

Db 65 GTLPWQGIICNANNPCFRYPTPGEAPGWGNFNKSIVARLFSDARRLL LYSQKDT 120

Qy 116 SLGSELEALR--QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173

I : : I I I : : I : : I II I I I I I I I 

Db 121 SMKDMRKVLRTLQQI KKS S SNLKLQDFLVDNETFSG FLYHNLSLP 165 

Qy 174 NSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233 

II : I I I : I : I I I I I : I I : : 
Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204 

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II I I : : : | : | : I I : : I : 

Db 205 QL GDQEVSELCGLPREKLAAAE RVLRSNMDI 235 

Qy 294 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD AQKVLQDVDVLS 341

I : : I : I I : I : . I III I : I : : I I 

Db 236 LKPILRTLNSTSPFPSKELAEA--TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293

Qy 342 ALALLLPQGACTGRTPGPPASGAGGAAN GT GAGAVMG PN ATAE E GAP S AAALAT P 396 

: : I : III : I : I I I I : II 

Db 294 SSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353

Qy 397 DT LQGQCS AFVQ--LWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLV 451

I :: I : : : I hhl I 
Db 354 YCNDLMKNLESSPLSRI1WKALKPLLVG 381 

Qy 452 HLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQWLNISAEIRSFLEQGRLQQ 511 

I I I I I : | : : | : | | : : | : | : | : | : | : 

Db 382 KILYTPDTPATRQVKAEWKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434 

Qy 512 HLRWL QQYVAELRLHPE ALNLSLDELPPALRQDNFS 547

: I I 1 Ill III : I I : I : I

Db 435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN-- 492

Qy 548 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607 

I : I I : | | | : : : : | : : | : : | : | 

Db 493 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME--LLDERKFWAG 535 

Qy 608 VIFQTRKDGS--LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG GRFYFLYGFVW 662

: : I II I I I I I I I I : I : I I : I : I I I I I I I I : 

Db 536 IVFTGITPGSIELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPRADPFEDMRYVWGGFAY 595 

Qy 663 IQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVI EHMMPLCMVISWVYSVAM 722 

:||::|:||| I : : I hi Mill Mil: I I I I : : I : I I I I : 
Db 596 LQDWEQAIIRVLTGTE-KKTGWMQQMPYPCYVDDIFLRVMSRSMPLFMTLAWIYSVAV 654 



723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSI SVT ALTAI LKYGQVLMHSH 782 

I : I I I I I I I I I I : I I I : I : : I : I I I : : I : I I I I I I : I : I 
655 IIKGIVYEKEARLKETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWILKLGNLLPYSD 714 

783 VVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAHDK 842

: : : : I I : I : I I I I : I I I : I I : I : I I I : I I I I I I I I I : I I : II 
715 PSWFVFLSVFAWTILQCFLI STLFSRANLAAACGGI I YFTLYLPYVLC VAWQD 769 

843 ITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLM 901 

I I I I I : I I I I I : I I I I : I I : I : I I : I I I I I I I I : I : I : : 

770 YVGFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMML 829 

902 VDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961 

I : I I :: I II I I I I I I I I : I I I I I I I II I I I hill: I : I 

830 FDTFLYGVMTWYIEAVFPGQYGIPRPWYFPCTKSYWFGE ESDEKSHPGSNQKRIS— 884 

962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021 

: I I I I I I I I I I I : lllhl I : I : : I : I I I I I : 

885 EIC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929 

1022 SFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLT 1081 
I I I I I I I I I I I I I I I I I I I I I I I I II : I I I I II : I I I I : I I I : I I I I I I I I I II 
930 S FLGHNGAGKTTTMS I LTGLFPPTS GTAYI LGKDI RS EMSTI RQNLGVCPQHNVLFDMLT 989 

1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140
I I I I : I I I : I II :::::: I I : : I I : I I : I I I I I I I : I I I I I I : I I I I

990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049

1141 GSRAI ILDEPTAGVDPYARRAIWDLI LKYKPGRTI LLSTHHMDEADLLGDRI AI I SHGKL 1200 

II: : II I I I I I I I I I I : I I I I : I : I I I : I I I I : II I I II I I I I : I I I I I I I I I I I I I 
1050 GSKWI LDEPTAGVDPYSRRGIWELLLKYRQGRTI I LSTHHMDEADVLGDRIAI I SHGKL 1109 

1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPG GPQEPGLAS 1238 

I I I I I I I I I I I I I I I : I : I I I 

1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298

I : I INN: III I I : I : I I I I I I : I I I II : : 
1170 DHESDTLTIDVS— AISNLIRKHVSEARLVEDIGHELTYVLPYEAAKEGAFVELFHEIDD 1227 

1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358 

I I : I I : I : : I I I I I : I I I I : I I I I I : II 
1228 RLSDLGI SS YGISETTLEEIFLKVAEE SGVDA-ETSDGTLP 1267 

1359 NLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD--NVSLQEV 1415

II: I : I II I : II: : :■ : 

1268 ARRNRRA- FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303 

1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474 

1:11:1:11::! I Ml II II I I I : I I : I I : I I I I I I : I : 
1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363

1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533 

: I I I I I I I I I : I I : : : I I :|:: 

1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE DTGTLELLNAL 1408 



Qy 1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593
I I I: : I I :
Db 1409 TKDPGFGTRCM- -EGNPI- 1424

Qy 1594 PSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVRC 1641
II I I I I I I : I I : I : : : : I
Db 1425 PD TPCQAGEEEWTTAP-VPQTIMDLFQNGNWTMQNPSPAC 1463

Qy 1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF 1690
II: II II III: I I I I I : I I I : I : I I : I :
Db 1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523

Qy 1691 RLHRYGAITFG--NVLKSIPASFGTRAPPMVRK 1721
I I I : I I I : I : : I
Db 1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDAIKQMKKHLKLAKDSSADRFLNSLGRFMTG 1583

Qy 1722 IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSAS 1781
: : : I : : I I I I : I : : : : I I : I I I I I I I I I | : I [ : | | j | | | | : | |
Db 1584 LDTKNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642

Qy 1782 LS-LDYLLQGTDWIAIFIIVAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840
II: : I I ::: I : I I I I I I I I I I I I I I : I :: I I I I I I I : I I I : I I I I : I
Db 1643 LSEVALMTTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKAKHLQFISGVKPVIYWLSN 1702

Qy 1841 YVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEV 1900
: I I I I I I : I I I I : I I I : I I I I I : I I I I I I I I I I : I I I I I I I : :
Db 1703 FVWDMCNYWPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKI 1762

Qy 1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLME 1960
11:1111 :|lllll :MII:|:II I I : I I I I I I I I I :: I I I I : :
Db 1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821

Qy 1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQR 2020
I I : : : : I : : : II I I : I I I I I I I I I I I I I : I :: I I I I I :
Db 1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880

Qy 2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079
: I : I I I I I I I I : I I I I : : : I : I I I : I : : I I I I I : I : I :
Db 1881 VNAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK---RKPAVDRICVGIP 1937

Qy 2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139
I I I I I I I I I I I I I I I : I I I I I I I I I : I I : I I : I : I : I : : I I : : I I II I I I :
Db 1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997

Qy 2140 FDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199
: I I I I I : : : III: I : : I : I I : I I I I I : I I I I I I I I I I I I I I I :
Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057

Qy 2200 ALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIM 2259
I I I I I : I I I I I I I I I I I I I I II I I I I : : I I I I I I I I I I I I I I I I I I I I I : I I I
Db 2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSWKEGRSWLTSHSMEECEALCTRMAIM 2117

Qy 2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318
I II I I III 1:1 Ml I I I I III I 1 I I: : I I I I II ::|||:| :|
Db 2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVLKEKHRNMLQ 2177

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368
III I I I I :: I I : I I I I I 1 I I I I I I I I I I II I II III:
Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227



Search completed: September 1, 2004, 11:12:58 
Job time : 234 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: September 1, 2004, 10:44:52 ; Search time 209 Seconds 

(without alignments) 
3677.523 Million cell updates/sec 



Title:          US-10-088-467-2
Perfect score:  12668
Sequence:       1 MGFLHQLQLLLWKNVTLKRR...GLISFEEERAQLSFNTDTLC 2436

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       1017041 seqs, 315518202 residues

Total number of hits satisfying chosen parameters:      1017041

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :      SPTREMBL 25:*
                1:  sp_archea:*
                2:  sp_bacteria:*
                3:  sp_fungi:*
                4:  sp_human:*
                5:  sp_invertebrate:*
                6:  sp_mammal:*
                7:  sp_mhc:*
                8:  sp_organelle:*
                9:  sp_phage:*
                10: sp_plant:*
                11: sp_rodent:*
                12: sp_virus:*
                13: sp_vertebrate:*
                14: sp_unclassified:*
                15: sp_rvirus:*
                16: sp_bacteriap:*
                17: sp_archeap:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
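
The run header above specifies a Smith-Waterman ("sw") search with gap penalties Gapop 10.0 and Gapext 0.5. The sketch below is only an illustration of that kind of affine-gap local scoring, not the GenCore implementation. The function name smith_waterman_affine, the toy substitution function (standing in for BLOSUM62, which is too large to embed here), and the convention that the first gap position costs gap_open and each further position costs gap_extend are all assumptions made for the example.

```python
def smith_waterman_affine(a, b, score, gap_open=10.0, gap_extend=0.5):
    """Best local alignment score of a vs b with affine gap costs.

    Assumed convention: the first position of a gap costs gap_open,
    each additional position costs gap_extend.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of a local alignment ending at (i, j)
    # E: same, but ending with a gap that consumes a residue of b
    # F: same, but ending with a gap that consumes a residue of a
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            diag = H[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: never let the running score drop below zero.
            H[i][j] = max(0.0, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


def toy_score(x, y):
    """Hypothetical stand-in for a substitution matrix such as BLOSUM62."""
    return 5.0 if x == y else -4.0


# Identical sequences: the score is the sum of the self-match scores,
# which is the idea behind the "Perfect score" line of the header.
print(smith_waterman_affine("ACDEFGHIKL", "ACDEFGHIKL", toy_score))
# Two inserted residues: one gap opening plus one extension is charged.
print(smith_waterman_affine("ACDEFGHIKL", "ACDEFWWGHIKL", toy_score))
```

With a real substitution matrix in place of toy_score, aligning the query against itself would give the value reported as the perfect score, provided the scoring conventions match those of the search program.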
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ALIGNMENTS 



RESULT 1 
Q9HC28 

ID Q9HC28 PRELIMINARY; PRT; 2436 AA.

AC Q9HC28; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 



DE ATP-binding cassette sub-family A member 2 (ABC transporter

DE ABCA2).

GN ABCA2.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Brain;

RA Vulevic B., Chen Z., Davis W. Jr., Walsh E.S., Tew K.D.;

RT "Cloning and characterization of human ABCA2.";

RL Submitted (AUG-1999) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE FROM N.A. 

RX PubMed=11178988; 

RA Kaminski W.E., Piehler A., Pullmann K., Porsch-Ozcurumez M., Duong C.,

RA Bared G.M., Buchler C., Schmitz G.;

RT "Complete Coding Sequence, Promoter Region, and Genomic Structure of

RT the Human ABCA2 Gene and Evidence for Sterol-Dependent Regulation in

RT Macrophages.";

RL Biochem. Biophys. Res. Commun. 281:249-258(2001).

DR EMBL; AF178941; AAG09372.1; -.

DR EMBL; AF327657; AAK14334.1; -.

DR PIR; A59189; A59189.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005524; F:ATP binding; IEA.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA.

DR GO; GO:0000166; F:nucleotide binding; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006118; P:electron transport; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR000566; Lipocln_cytFABP.

DR InterPro; IPR000572; Oxidored_molyb.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

DR PROSITE; PS00022; EGF_1; 1.

DR PROSITE; PS00213; LIPOCALIN; 1.

DR PROSITE; PS00559; MOLYBDOPTERIN_EUK; 1. 

KW ATP-binding. 

SQ SEQUENCE 2436 AA; 269955 MW; E044A3AF14EA25D1 CRC64; 

Query Match 100.0%; Score 12668; DB 4; Length 2436; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 2436; Conservative 0; Mismatches 0; Indels 0; Gaps 0 

Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60



Qy 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120 



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120 

Qy 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180 

Qy 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | I I | | | | | | | | | | | 
Db 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240 

Qy 241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I 
Db 241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300 

Qy 301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360 

Qy 361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420

Qy 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480 

Qy 481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540 

I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540 

Qy 541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600 

Qy 601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660

Qy 661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720

Qy 721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780

Qy 781 SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 781 SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

Qy 841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900 

Qy 901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 


Db  901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960

Qy  961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020

Qy 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080

Qy 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260

Qy 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320

Qy 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380

Qy 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440

Qy 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500

Qy 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560

Qy 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 1620

Qy 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680

Qy 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 1740

Qy 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800



Qy 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1801 VAMSFVPASFVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILF 1860

Qy 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1861 VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920 

Qy 1921 VATFLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1921 VATFLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980



Qy 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 KSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 2040

Qy 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100

Qy 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160

Qy 2161 KDEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 KDEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220



Qy 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2221 ARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 

Qy 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 

1 1 1 I 1 1 1 1 1 I I 1 1 ! 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I I 1 1 1 1 1 1 1 1 I I 1 1 I 1 1 I 1 1 I I I 

Db 2281 YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 

Qy 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 2341 VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

Qy 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2401 RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436



RESULT 2
Q9ESR9
ID Q9ESR9 PRELIMINARY; PRT; 2434 AA.
AC Q9ESR9;

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE ABC2.

GN ABC2.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Brain;

RX MEDLINE=20427713; PubMed=10970803;

RA Zhao L., Zhou C., Tanaka A., Nakata M., Hirabayashi T., Amachi T.,

RA Shioda S., Ueda K., Inagaki N.;

RT "Cloning, characterization and tissue distribution of the rat ATP-

RT binding cassette (ABC) transporter ABC2/ABCA2.";

RL Biochem. J. 350:865-872(2000).

DR EMBL; AB037937; BAB16596.1; -.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005524; F:ATP binding; IEA.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA.

DR GO; GO:0000166; F:nucleotide binding; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR000566; Lipocln_cytFABP.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

DR PROSITE; PS00022; EGF_1; 1.

DR PROSITE; PS00213; LIPOCALIN; 1.

KW ATP-binding.

SQ SEQUENCE 2434 AA; 270925 MW; CD424A9C4F63513F CRC64; 

Query Match 92.6%; Score 11725; DB 11; Length 2434; 

Best Local Similarity 92.8%; Pred. No. 0; 

Matches 2262; Conservative 49; Mismatches 122; Indels 4; Gaps 4; 
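
The percentages in the summary lines above can be reproduced from the other figures in this listing. The sketch below is an inference from the printed numbers, not a documented GenCore definition: it assumes that Query Match is the hit score divided by the perfect score (12668 for this query), and that Best Local Similarity is the number of matches divided by the total number of alignment columns (matches, conservative substitutions, mismatches and indels). The function names are hypothetical.

```python
def query_match_pct(score, perfect_score):
    """Hit score as a percentage of the query's self-alignment score (assumed)."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns (assumed)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


PERFECT_SCORE = 12668  # "Perfect score" line of the run header

# RESULT 2 (Q9ESR9): reported as Query Match 92.6%, Best Local Similarity 92.8%
print(round(query_match_pct(11725, PERFECT_SCORE), 1))
print(round(best_local_similarity_pct(2262, 49, 122, 4), 1))

# Earlier hit with Score 4237.5: reported as 33.5% and 39.8%
print(round(query_match_pct(4237.5, PERFECT_SCORE), 1))
print(round(best_local_similarity_pct(999, 346, 730, 435), 1))
```

Both hits checked here agree with the reported values to one decimal place, which supports the assumed definitions without confirming them.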

Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I II 
Db 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEA-FYTAA 59

Qy 61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLFDPARPSLGSE 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I : I I I I I I I I I I I I I II I I 
Db 60 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLNRWEESNLFDPERPSLGSE 119 

Qy 121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180 

M I I I I I I I I : I I I I II I I I I I I I I | I || : : I I I I I I I I I I I I I I I I I I I 
Db 120 LEALHQRLEALSSGPGTWESHSARPAVSSFSLDSVARDKRELWRFLMQNLSLPNSTAQAL 179

Qy 181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240

I I I I I I I I I I I I I I I II : I : I I I I I II I I I I : I I I I I I I I I I I I I I I I 
Db 180 LAARVDPSEVYRLLFGPLPDLDGKLGFLRKQEPWSHLGSNPLFQMEELLLAPALLEQLTC 239 

Qy 241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300 

I I I I I I I I I I I : I I : I I I I I I I I I I I I I III: II I : I I I I I I I : I I :: I I I 
Db 240 APGSGELGRILTMPEGHQVDLQGYRDAVCSGQATARAQHFSDLATELRNQLDIAKIAQQL 299 



Qy 301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360



I : I I I I I II I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 300 GFNVPNGSDPQPQAPSPQSLQALLGDLLDVQKVLQDVDVLSALALLLPQGACAGRAPAPQ 359 

Qy 361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420

I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 360 AGSPSGPANSTGVGANTGPNTTVEEGTQSPVTPASPDTLQGQCSAFVQLWAGLQPILCGN 419 

Qy 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 420 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEADHVILKANETF 479 

Qy 481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I : I I I I I I I II I 

Db 480 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLHWLQQYVADLRLHPEAMNLSLDELPPA 539 

Qy 541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600 

II I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 540 LRLDYFSLPNGTALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 599

Qy 601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 600 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 659

Qy 661 VWIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720

I I I I I I : I I I I I : I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 660 WIQDMIERAIINTFVGHDVVTIPGNWQMFPYPCYTRDDFLFVIEHMMPLCIWISWVYSV 719 

Qy 721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 720 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWAWFITGFVQLSISVTALTAILKYGQVLMH 779 

Qy 781 SHVVIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840

I I I : I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 780 SHVLIIWLFLAWAVATIMFCFLVSVLYSKAK^SACGGIIYFLSWPYMYVAIREEVAH 839 

Qy 841 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 840 DKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTML 899 

Qy 901 MVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960

Ml I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 900 ^DTVVYGVLTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTETWEWSWPWAHAPRLSV 959 

Qy 961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020 

I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I : I I I I I I I I I | I | | | | | | | 
Db 960 MEEDQACAMESRHFEETRGMEEEPTHLPLWCVDKLTKVYKNDKKLALNKLSLNLYENQV 1019 

Qy 1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I 

Db 1020 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDQL 1079

Qy 1081 TVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1140 

I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1080 TVEEHLWFYSRLKSMAQEEIRKEMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 1139 

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200
I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I
Db 1140 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1199

Qy 1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGIASSPPGRAPLSSCSELQVSQFIRK 1260
I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1111:1111 II 11:111:11111111
Db 1200 KCCGSPLFLKGAYGDGYRLTLVKRPAEPGTSQEPGMASSPSGRPQLSNCSEMQVSQFIRK 1259

Qy 1261 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 1320
Nil I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I II I
Db 1260 HVASSLLVSDTSTELSYILPSEAVKKGAFERLFQQLEHSLDALHLSSFGLMDTTLEEVFL 1319

Qy 1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 1380
HI I I I I I I I I I I I I II I I II I I : I I I I I I I I I I I I I I I I I I I I I I I I
Db 1320 KVSEEDQSLENSE7VDVKESRKDALPG7VEGLTAVESQAGNLARCSELAQSQASLQSASSVG 1379

Qy 1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 1440
I I I I I I I I I I I I I I I I I I I I I II I II I I : I I I I I I I I I : I I I I I I I I I : | I I I : I I I
Db 1380 SARGDEGAGYTDGYGDYRPLFDNLQDPDSVSLQE7VEMEALARVGQGSRKLEGWWLKMRQF 1439

Qy 1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500
I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | I I I I I I I I I I I I I
Db 1440 HGLLVKRFHCARRNSK7VLCSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1499

Qy 1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I
Db 1500 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPMLNLS 1559

Qy 1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDED-LQAWNVSLPPT 1619
I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml M I I I
Db 1560 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPLSPDEDSLLAWNTSLPPT 1619

Qy 1620 AGPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNV 1679
I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1620 AGPETWTWAPSLPRLVHEPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNV 1679

Qy 1680 SEYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHS 1739
I I I I I I I I I I I I I I I I I I I I I I I : I I I I I III I I I I I I I I I I III II I I I I I I
Db 1680 SEYLLFTSDRFRLHRYGAITFGNIQKSIPAPIGTRTPLMVRKIAVRRVAQVLYNNKGYHS 1739

Qy 1740 MPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFI 1799
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1740 MPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFI 1799

Qy 1800 IVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIIL 1859
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I HI I I I : I I I
Db 1800 IVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPVIYWLANYVWDMLNYLVPATCCIIIL 1859

Qy 1860 FVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITA 1919
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1860 FVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITA 1919

Qy 1920 TVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDK 1979
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I | | | | | | | | | | |
Db 1920 TVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEIAYNEYINEYYAKIGQFDK 1979

Qy 1980 MKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQ 2039
I I I I I I I I I I I I I I I I I III III I I I I I I I I I I I : I I | : | | | | | | | | | | | | | | | | | |
Db 1980 MKSPFEWDIVTRGLVAMTVEGFVGFFLTIMCQYNFLRQPQRLPVSTKPVEDDVDVASERQ 2039



Qy 2040 RVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFK 2099 

Mill I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2040 RVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFK 2099 

Qy 2100 MLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGIS 2159 

I I I I I I I I I I I I I I I I I I I I I I : I I I I | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I 
Db 2100 MLTGDESTTGGEAFVNGHSVLKDLLQVQQSLGYCPQFDALFDELTAREHLQLYTRLRGIP 2159 

Qy 2160 WKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDP 2219 

I I I I I : I I : I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2160 WKDEAQWRWALEKLELTKCADKPAGSYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDP 2219 

Qy 2220 KARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRI^IMVNGRLRCLGSIQHLKNRFGD 2279 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2220 KARRFLWNLILDLIKTGRSWLTSHSMEECEAVCTRLAIMVNGRLRCLGSIQHLKNRFGD 2279 

Qy 2280 GYMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVS 2339 

I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 228 0 GYMITVRTKSSQNVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEHW 2339 

Qy 2340 GVLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTE 2399 

I I I I I I I I I I I I I I I I I I I I I I I | I | | | | : | | | | | || | INI II I II II I MM 
Db 234 0 GVLGIEDYSVSQTTLDNVFVNFAKKQSDNVEQQEAE-PSTLPSPLG-LLSLLRPRPAPTE 2397 

Qy 2400 LRALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 

I I I I I I I II II I I I I I I I I I I I II I I I II I I I I I II I 
Db 2398 LRALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2434 

RESULT 3 
Q96HC2 



ID Q96HC2 PRELIMINARY; PRT; 867 AA. 

AC Q96HC2; 

DT 01-DEC-2001 (TrEMBLrel. 19, Created) 

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Similar to KIAA1062 protein (Fragment).

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Eye; 

RA Strausberg R.;

RL Submitted (MAY-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC008755; AAH08755.1;

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005524; F:ATP binding; IEA.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA.

DR GO; GO:0000166; F:nucleotide binding; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR InterPro; IPR006209; EGF_like.

DR Pfam; PF00005; ABC_tran; 1.

DR ProDom; PD000006; ABC_transporter; 1.

DR SMART; SM00382; AAA; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 1.

DR PROSITE; PS00022; EGF_1; 1.

KW ATP-binding.

FT NON_TER 1 1

SQ SEQUENCE 867 AA; 96734 MW; DCF6B6A90074C085 CRC64; 

Query Match 35.7%; Score 4518; DB 4; Length 867; 

Best Local Similarity 99.9%; Pred. No. 3.6e-291; 

Matches 866; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 
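
The summary figures above appear to be related by simple arithmetic. The sketch below assumes (this report does not state it) that "Best Local Similarity" is the number of matches divided by the total number of aligned columns, taken as Matches + Conservative + Mismatches + Indels. The helper names `parse_counts` and `best_local_similarity` are hypothetical and exist only for this illustration.

```python
import re

def parse_counts(line):
    """Parse a 'Matches 866; Conservative 0; ...' summary line into a dict of ints."""
    return {key: int(val) for key, val in re.findall(r"([A-Za-z]+)\s+(\d+);", line)}

def best_local_similarity(counts):
    """Assumed formula: matches as a percentage of all aligned columns."""
    aligned = sum(counts[k] for k in ("Matches", "Conservative", "Mismatches", "Indels"))
    return 100.0 * counts["Matches"] / aligned

# Summary line of RESULT 3 as printed in this report.
line = "Matches 866; Conservative 0; Mismatches 1; Indels 0; Gaps 0;"
counts = parse_counts(line)
print(f"{best_local_similarity(counts):.1f}%")  # 866 of 867 columns
```

Under this assumption the computed value rounds to the 99.9% reported for this hit.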

Qy 1570 RFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAP 1629
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1 RFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAP 60

Qy 1630 SLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDR 1689
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I .1 I I I I I I I I I I
Db 61 SLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWAGDILTDITGHNVSEYLLFTSDR 120

Qy 1690 FRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNN 1749
I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I
Db 121 FRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNN 180

Qy 1750 AILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAMSFVPAS 1809
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 181 AILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDVVIAIFIIVAMSFVPAS 240

Qy 1810 FWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTS 1869
INN I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 241 FWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTS 300

Qy 1870 PTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLF 1929
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 301 PTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLF 360

Qy 1930 EHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIV 1989
I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I
Db 361 EHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIV 420

Qy 1990 TRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADND 2049
I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | | I I I I I I I I I I I I I I I
Db 421 TRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADND 480

Qy 2050 MVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTG 2109
M I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 481 MVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTG 540

Qy 2110 GEAFWGHSV^KELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVVKW 2169
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I
Db 541 GEAFWGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKW 600

Qy 2170 ALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLI 2229
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | II I I I I I I I
Db 601 ALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLI 660


Qy 2230 LDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKS 2289
1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 661 LDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKS 720

Qy 2290 SQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSV 2349
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | | | | | | | 1 1 1 1 1 1 1 1 1 1 1 1
Db 721 SQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSV 780

Qy 2350 SQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPE 2409
1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 II 1 1 M 1 1 1 1 1 1 1
Db 781 SQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVADEPE 840

Qy 2410 DLDTEDEGLISFEEERAQLSFNTDTLC 2436
1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 II 1 1 1 II 1 1
Db 841 DLDTEDEGLISFEEERAQLSFNTDTLC 867



RESULT 4 
Q8UVV4

ID Q8UVV4 PRELIMINARY; PRT; 2260 AA.

AC Q8UVV4;

DT 01-MAR-2002 (TrEMBLrel. 20, Created) 

DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE ATP-binding cassette transporter 1. 

GN ABCA1.

OS Gallus gallus (Chicken).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae; 

OC Gallus. 

OX NCBI_TaxID=9031; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Attie A.D., Brooks-Wilson A.R., Walker D., McManus B., 

RA Gray-Kellar M.P., MacDonald M.L.E., Roomp K., Tebon A., Zhang L.-H., 

RA Mulligan J., Sensen C., Bitgood J.J., Cook M.E., Kastelein J.J.P.,

RA Hayden M.R.;

RT "Cholesterol Ester Accumulation in Hepatocytes and Intestinal Lamina

RT Propria Caused by an ABCA1 Mutation in WHAM Chickens.";

RL Submitted (MAR-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF362377; AAL56247.1; -.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005524; F:ATP binding; IEA.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA.

DR GO; GO:0000166; F:nucleotide binding; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

KW ATP-binding. 

SQ SEQUENCE 2260 AA; 254070 MW; 19D137F342F98662 CRC64; 



Query Match 33.3%; Score 4214; DB 13; Length 2260;

Best Local Similarity 38.9%; Pred. No. 3e-270;



Matches 982; Conservative 367; Mismatches 719; Indels 458; Gaps 60; 



Qy 1 MGFLHQLQLLLWKNVTLKRRS PWVLAFEI FI PLVLFFI LLGLRQKKPTI SVKEVPFYTAA 60 

M 111111111:11 : | | : I | : I I I | : : I I I I I 

Db 1 MAFWTQLGLLLWKNFTYRRRQTFQLLIEVAWPLFIFFILISVRLSYPPYEQHECHFPNKA 60 

Qy 61 PLTSAGILPVMQSLCPDGQRDEF — GFLQYANSTVTQLLERLDRWEEGNLFD 111 

: I I I I I : I : : | | : | : :: | | 

Db 61 -MPSAGTLPWIQGIICNANNPCFRYPTPGESPGIVGNFNASIV SRLFS 107 

Qy 112 PARPSL GSELEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSV7VRNPQELWRFL 166 

I : I :::::: | | : I I I : I : I : : II 

Db 108 DAKRLLLYSQQDTSIKDVQKVLAKLRKLGNSSGLDL KLRDFLVDN ETFSDFL 159 

Qy 167 TQN L S L PN S TAQAL LAARVD P P E V YHLLFGPSSALDSQS- 205 

I : I : I : I : II I I : : I : : I I : I I : : 

Db 160 RHNVSMP S S AVEELLDAEVNLQKVT VS GYRI QLRDLCNS S ALS EFLT I QNRS VAMDS EAF 219 

Qy 206 GLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTCTPGSGELGRILTVPESQK 258 

II : : I III :: : : I I : I 
Db 220 LCTLPKETLHAAELAF-RANLNPLKPLQREIFFNSSLRDLSET 261 

Qy 259 GALQGYRDAVCSGQ7WVRARRFSGLSAELRNQLDVAKVSQQ-LGLDAPNGSDSSPQAPPP 317 

: : I I : : III: : : | : : | I I : I I I 
Db 262 --VEALRDSL GKLVKELLSMKSWSDMRQEVMFLTNVNASNSSTQI 304 

Qy 318 RRLQALL GDLL DAQ KVLQ DVDVL S ALAL L L P Q GACT G RT P G P PAS GAGGAA NG 370 

: I : 1:1 II 
Db 305 YQAVSRIVCGHPEG GGLKIKSLNWYED 331 

Qy 371 T GAGAVMG PN ATAEE GAP S AAALAT P DT LQGQ C S AFVQ LWAGLQPILCGN 420 

I: I 1:1 :: II |: :: : I I : I : I I 

Db 332 NNYKALFGGNSTEDDVTNFYDNSTTP YCNELMKNLESSPLSRIIWRALKPLLIG- 385 

Qy 421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480

1:1 I I : ::: : | | | 

Db 386 KVLYTPDTPAIRKIMAEVNRTF 407

Qy 481 AFVGNWHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540 

: I : I I I : I : I : I : : I I : I I I I I : : : : : 

Db 408 QELGVFRDLGGMWEEISPKIWTFMESSQEMDLIRTLLKSKALWDLHLPASNWTVEDVARF 467

Qy 541 LRQ--DNFSLPSGMAL — LQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQ 596 

I : : I :|| : : I I :|| |::| : I : : I : I 

Db 468 L S KH P E E FEADN GMVYTWVDAFN ET D RAI QT I S RFMEC VN LDKLE P VAT EVRL I N K S LE- 526 

Qy 597 AYQDNVTVFASVI FQTRKDGS — LPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG 651 

I : I I : I I I I I I I I I I : I : I I : I : Mill 

Db 527 -LLDERRFWAGWFTEIAPNSTELPQHVKYKIRMDIDNVERTNKIKDGYWDPGPRADPFE 585 

Qy 652 GRFYFLYGFWIQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLC 711 

I M ::||::|:MI I : : I III Mill MM: III 
Db 586 DMRYWGGFTYLQDWEQAIIRVQTGTE-KKTGVYVQQMPYPCYVDDIFLRVMSRSMPLF 644 

Qy 712 1WISWWSVAMTIQHIVAEKEHRLKEVM 771 

M : I : I I I I : I : I II I II I I I I : I I I : I : I : : I I I : : I : I I I 
Db 645 MTLAWI YSVAVI I KGI VYEKEARLKETMRIMGLDNGI LWLSWFI S SLI PLLMSAGLLVLI 704 



Qy 772 L K YGQ VLMH S HWI I W L FLAVYAVAT I M FC FL VS VL Y S KAK LAS AC GGIIYFLS YVP YM Y 831
II I :| :| ::::||::: : I I : I I I : I : : I : I I I : I I I I I : I I |:||:
Db 705 LKMGNLLPYSDPSWFVFLSIFGIVTILQCFLISTVFSRANIJWVCGGIWFTLYLPYVL 764

Qy 832 VAI RE E VAH D K I T AFE - KC I AS LMS T T AFG L G S K Y FAL Y E VAGVG I QWHT F S Q S P VEGD D 890
II : I I M I : I 1111:1111:1 I I I : I I I : I I : I I
Db 765 C VAWQDYVSFSLKIFASLLSPVAFGFGCEYFALFEEQGVGVQWDNFFESPLEEDG 819

Qy 891 FNLLLAWMLMVDAVWGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSW 950
: I:: I : I I : : I I I I I : I I I I I : I I I I I I I Mill :: :
Db 820 FSITTSAVMMLFDTFLYGVMTWYIESVFPGQYGIPRPWYFPFTKSYWFGE-ESQDRQHLH 878

Qy 951 PWARTPRLSVMEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNK 1010
I : I : I I I I I I I I I : I I I I : I I I : I ::
Db 879 PDQKGP- SEVC KEEEPMHLSLGVSIQNLVKVYRDGKKVAVDG 919

Qy 1011 LS LNL YENQWS FLGHNGAGKTTTMS I LTGLFPPTS GSATI YGHDI RTEMDEI RKNLGMC 1070
Mil M I : I I I I I I I I I I I I I I I I I I I II I I I I I : I I I I I I : I : I I : I I I : I
Db 920 LTLNFYEGQI TS FLGHNGAGKTTTMS I LTGLFP PTS GTAFI LGKDI RS ELSTI RQNLGVC 979

Qy 1071 PQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKR 1130
I I I I I I I I I I I I I I : I I I : I I I :::::: I I :: | I : I : I : I I I [ I : I
Db 980 PQHNVLFDLLTVEEHIWFYARLKGLPEKKVKEEMEQMAMDVGLPHKLKARTSKLSGGMQR 1039

Qy 1131 KLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGD 1190
11111:111111: : I I I I I I I I I I I I : I I I I : I : I I I : I I I I : I I I I I I I I I I : I I I
Db 1040 KLSVALAFVGGSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADILGD 1099

Qy 1191 RIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEPG 1229
I I I I I I I I I I I I I I I I I I I I I I I I I : :
Db 1100 RIAIISHGKLCCVGSSLFLKNQLGTGYYLTLVKKDVDSSLSSCRNSSSTVSYLKKDDSVS 1159

Qy 1230 -GPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGA 1288
: I I I I : I I I I I III I I : I : I I : I I I : II
Db 1160 QSSSDAGLGSDHESDTLTIDVS--AISNLITKHVPEARLVEDIGHELTYVLPYKAAKEGA 1217

Qy 1289 FERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAE 1348
I M : : I I : I I : I : : I I I I I : I I I I : : I I I : I |
Db 1218 FVELFHEIDDRLSDLGISSYGISETTLEEIFLKVADD SGVDA-ETSDGTLP 1267

Qy 1349 GPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGD YRPLFD-- 1402
II: I : I I II :
Db 1268 ARRNRRA FGDRQSCLRPFTEDD 1289

Qy 1403 NPQDPDWSLQEVEAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSK7VLF 1459
•IN: : I : I I : I : I I :: | | : | | I I | | | : | : | |
Db 1290 AFDPNDSD-IDPESRETDLLSGMDGKGSYQMKGWKLSQQQFMALLWKRLLIAKRSRKGFF 1348

Qy 1460 SQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQY-HNYTQPRGNFIPYANEERREYRL 1518
: I I : I I I I I I : I : : I I I I I I I I I II " : : : I
Db 1349 AQIVLPAVFVCIALMFSLIVPPFGKYPSLELQPWMYDEQYT FISNDAPE 1397

Qy 1519 RLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLE 1578
II 1:1: I I |:
Db 1398 DAGTQKLLDALLNKPGFGTRCM 1419

Qy 1579 SFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREP 1638
II' II I I I : I I : I I : I I I
Db 1420 QGHSI pd TPCTVGQKEWTTA-SVPDSVLEI 1448

Qy 1639 VR CTCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYL 1683
•I III: II II Ml | | | | : : M I : I : I I
Db 1449 LRGNWSMENPSPSCECSNEKIKKMLPVCPPGAGGLPPPQREQDTADILQNLTGRNISDYL 1508

Qy 1684 LFTSDRF RLHRYGAITFG NVLKS I PAS FGTRAPPMVRKI A 1723
: I : 111:1 : I I I:: I I I : I I
Db 1509 VKTYAQIIGKSLKNKIWVNEFRYGGFSLGARSSHVLP--PSNEVTDATKQVKKILELAQG 1566

Qy 1724 VRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAA 1765
: : I : : I I I I : I : : : : I I : I I I I I I I I I : I I I : I
Db 1567 S S GDRFLNNLAS FMKGLDTKNNVKVWFNNKGWHAIAS FLNVINNAI LRANLQQGK-NPSA 1625

Qy 1766 YGITVTNHPMNKTSASLS-LDYLLQGTDWIAIFIIVAMSFVPASFWFLVAEKSTKAKH 1824
llllllhll II: : I I ::: I : I I I I I I I I I I I I I I : I :: I I I I
Db 1626 YGI TAFNH P LNLT KQQL S EVALMTT S VDVLVS I CVI FAMS FVP AS FWFL I QERVS KAKH 1685

Qy 1825 LQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYG 1884
Mhll I : I I I II I : I I I I II : I I I I : I I I : I I : I I : I I I I I
Db 1686 LQFI SGVKPVI YWLANFVWDMCNYI VPATLVI 1 1 FI CFQQKS YVS S SNLPVLALLLLLYG 1745

Qy 1885 WSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKVVNSYLKS 1944
11111:111111 |::||:MI I =111111 : I I I I : I : I I : : I I :| III
Db 1746 WSITPLMYPAS FVFKIPSTAYVVLTSVNLFIGINGSVATFVLELFTNNK-LNNINDILKS 1804

Qy 1945 CFLIFPNYNLGHGmEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGF 2004
I II I I : : I I I I : : I I : : : : I : : : I I I I : I I I I I I I I I I I I
Db 1805 VFLIFPHFCLGRGLIDIWKNQAMADALERFGE-NRFVSPLSWDLVGRNLFT^IAVEGWFF 1863

Qy 2005 LLTIMCQYNFLRRPQRMPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSR 2063
1:1:: II I :|: : II I I II ||||:: I :|:::| |||:|: :
Db 1864 LITVLIQYRFFIKPRPVYAKLPPVNDEDEDVNRERQRIISGGGQSDILEIRELTKIYRMK 1923

Qy 2064 KIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKEL 2123
I I I I I : I : I : I I I II I I I I I I I I I I : I I I II II I I I I I : I I : I : I : I :
Db 1924 RKPAVDRICVGIPPGECFGLLGVNGAGKSSTFKMLTGDTDVTGGDAFLKGNSILSNI 1980

Qy 2124 LQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKP 2183
: I I : : I I I I I I I : : I I I I I I : : III: I : : I : I I : I I I I I : I
Db 1981 QEVHQNMGYCPQFDAVNELLTGREHLEFFALLRGVPEKEVCKVGEWAI RKLGLVKYGEKY 2040

Qy 2184 AGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK7VRRFLWNLILDLIKTGRSWLTS 2243
M I I I I I : I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I M I : I I I I I I I I I I
Db 2041 AGNYSGGNRRKLSTAIALIGGPPWFLDEPTTGMDPKARRFLWNCALSVIKEGRSWLTS 2100

Qy 2244 HSMEECEALCTRIAIMVNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVkDVVRFFNR 2302
I II I I I II I I I I : I I I I I I I I II I I : I I II I I I I I I I I II : : I I I I
Db 2101 HSMEECE7VLCTRMAIMVNGRFRCLGSVQHLKNRFGDGYTIWRIAGGNPDLKPVEEFFGH 2160

Qy 2303 NFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFA 2362
M : : I I I : I Mill I I I I - - I I : I I I I II I I I I I I I I I I I I I I
Db 2161 AFPGSVLKEKHRNMLQYQLPSSQSSLARIFSVLSQNKKRLHIEDYSVSQTTLDQVFVNFA 2220

Qy 2363 KKQSDN 2368
I III:
Db 2221 KDQSDD 2226



RESULT 5 
Q80ZB2 



ID Q80ZB2 PRELIMINARY; PRT; 2201 AA. 

AC Q80ZB2; 

DT 01-JUN-2003 (TrEMBLrel. 24, Created) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE ATP-binding cassette 1. 

GN ABCA1. 

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley;

RA Ananthanarayanan M., Mirza M.F.;

RT "Cloning and Characterization of Rat Liver Abca1.";

RL Submitted (DEC-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AY208182; AAO53557.1; -.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005524; F:ATP binding; IEA.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA.

DR GO; GO:0000166; F:nucleotide binding; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

KW ATP-binding. 

SQ SEQUENCE 2201 AA; 246553 MW; B1472978BFC3E6B8 CRC64; 



Query Match 32.2%; Score 4083; DB 11; Length 2201; 

Best Local Similarity 38.9%; Pred. No. 1.5e-261; 

Matches 969; Conservative 332; Mismatches 684; Indels 504; Gaps 63; 

Qy 62 LTSAGILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFD 111 

: I I I I I : I : : I . : : I | : : | | : : | : 
Db 1 MPSAGTLPWVQGIICNANNPCFRYPTPGEAPGWGNFNKSIVSRLFSDAQRIL LYS 56 

Qy 112 PARPSLGSELEALR — QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQN 169 

I: : || | :: :: :| I I || : 

Db 57 QKDTS I RDMHKVLRTLQQI KH PNSNLKLQDFLVDNET FSG FLQHS 101 

Qy 170 LS LPN S TAQALLAARVD P P EVY HLLFGPSSALDSQSGLHKGQEPWSRLGGNPLF 223

I I I I I I I I I : I : II : I I 
Db 102 LSLPRSAVDNLLQADVSLQKVFLQGYQLHL ASLCNGS- 138 

Qy 224 RMEELLLAPALLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGL 283 

I I : : III: I : I I : 



Db 139 KLEEIIR- PEDLKVS ALCS LPREKLDAP 165

Qy 284 SAELRNQLDVAK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLDA 330

I I I : : I : I I : I : I I : I III:

Db 166 ERELRSNMDI LKPVMTKL NSTSLLPTQHLAEATTTLLDSLGGLAQELFSTK 216

Qy 331 QKVL QD VD VL S ALAL L L P Q G AC TGRTPGPPAS GAG GAA 368

I : I : I : I : I : I II
Db 217 SWSDMRQEVMFLTNVNNSGSSTQIYQAVSRIVCGHPEG GGLKIKSL 262

Qy 369 N GT GAGAVMG PNAT AE E GAP S AAALAT P DTLQGQCSAFVQ -- LWAGLQPILCGN 420

I : I I I I : M l:: I : : : I I : I : I I

Db 263 NWYEDNNYKALFGGNGTEEDTDTFYDNSTTPYCNDLMKNLESSPLSRIIWKALKPLLIG- 321

421 NRT I EPEALRRGNMS S LGFT S KEQRNLGLLVHLMT SNPKI L YAPAGS EVDRVI LKANET F 480 

I I I I I : | : : | : | | 

322 K I L YT P DT PAT RQVMAEVNKT F 343 

481 AFVGNVTHYAQVWLNI S AEI RS FLEQG RLQQHLRW- LQQY 519 

: :| :| :| :|:| I | | | 

344 QELALFPDLEGMWEELSPQIWTFMESSQEMDLVRPMLDLRGNDQFWERKLDGLYWTAQDI 403 

520 VAELRLHPEAL NLSLDELPPALRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVS 576 

: I I : I I : I I : I : I I : I I : I I I : 

Db 404 MAFLAKNPEDVQSPNGSVYTWREAFNETN QAIQTIS RFMECVN 446

577 VDIFKGFPDEESIVNYTLNQAYQDNVTVFASVIFQ— TRKDGSLPPHVHYKIRQNSSFTE 634 

:: : I I :::| ::: I I : | | II II II! : | 

447 LNKLEPIPTEVTLINKSMD— LLDARKFSAGIDFTGITPDSVELPHHVKDKIRMDIDNVE 504 

635 KTNEI RRAYWRPGPNTG GRFYFLYGFVWIQDMMERAIIDTFVGHDWEPGSYVQMFP 691 

:M:|: II III I I I : : I I : : I : I I I I : : I III I 

505 RTNKIKDGYWDPGPRADPFEDMRYVWGGFAYLQDWEQAIIRVLTGTE-KKTGVYVQQMP 563 

692 YPCYTRDDFLFVT EHMMPLCMVI SWWSVAMTIQHIVAEKEHRLKEVMKTMGLNNAVHWV 751 

INI I II I : I I I I :: I : I I I I : I : I I I I I I I I I I : I I I : I : I 
564 YPC YVDDI FLRVMSRSMPLFMTLAWI YSVAVI I KS I VYEKEARLKETMRIMGLDNGI LWF 623 

752 AW FI T G FVQLS I S VT ALT AI LK YGQVLMH S HWI I WL FLAVYAVAT I MFC FLVS VL YS KA 811 

: I I I : : I : I I I I I I : I : I : : : : I I : I : I I I I : | | | : | | : | : 
624 SWFI.SSLI PLLVSAGLLVI ILKLGDLLPYSDPSWFVFLSVFAWTILQCFLI STLFSRX 683 

812 KLASACGGIIYFLSYVPYMYVAIREEVAHDKITAFE-KCIASLMSTTAFGLGSKYFALYE 870 

MHMMIII I : I I : I I | | I I I : I I I I I : I I I I : I 

684 NLAAACGGI I YFTLYLPYVLC VAWQ D YVGFS I K I FAS L LS PVAFG FGC E Y FAL FE 738 

871 VAGVG I QWHT FSQ S P VEGDD FNLLLAVTMLMVDAWYGI LTW Y I EAVH P GMYGL P RPW Y F 930 

1 = 1:11 : I I I I I I I I : I : I : : I : I I :: I I I I I II I I I I : I I I I I I 

739 EQGIGVQWDNLFKSPVEEDGFNLTTSVSMMLFDTFIYGVMTWYIEAVFPGQYGIPRPWYF 798 

931 PLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQACAMESRRFEETRGMEEEPTHLPLV 990 

I llll I I I I I : I : I I | 

799 PCTKSYWFGE EIDEKSHPGS SQKGASEIC MEEEPTHLKLG 838 

991 VCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGSAT 1050 

I • I I I I : I l:h: I : I I II I : I I I I I I I I I I I I I I I I I M I I I I I I I : I 
839 VSIQNLVKVYRDGMKVAVDGLALNFYEGQITSFLGHNGAGKTTTMSILTGLFPPTSGTAY 898 



Qy 1051 IYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIED 1110 

I I M I : I I : I I : I I I : I I I I I I I II I I I I I I : I I I : I I I : : : : : : I I : : I I 

Db 899 ILGKDIRSEMNSIRQNLGVCPQHWLFDMLTVEEHIWFYARLKGLSEKHVKAEMEQMALD 958 

Qy 1111 LEL- SNKRHS LVQTLSGGMKRKLSVAIAFVGGSRAI I LDEPTAGVDP YARRAIWDLI LKY 1169 

: I : I I I I I I I : I I I ! I I : I I I I I I : : I I I I I I I I.I I I I : I I I I : I : I I I 
Db 959 VGLPPSKLKSKTSQLSGGMQRKLSVALAFVGGSKVVI LDEPTAGVDP YSRRGIWELLLKY 1018 

Qy 1170 KPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEPG 1229 

: I I I I : I I I I I I I I I I : II I I I I II I I I I I I II I I I I I II Mill: I 
Db 1019 RQGRTIILSTHHMDEADILGDRIAIISHGKLCCVGSSLFLKNQLGTGYYLTLVKKDVESS 1078 

Qy 1230 GPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLL 1267 

: I I I I : I I I I I I : I 

Db 1079 LSSCRNSSSTVSCLKKEDSVSQSSSDAGLGSDHESDTLTIDVS — AISNLIRKHVSEARL 1136 

Qy 1268 VSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQ 1327

II I I : I : I I I I I I : I I I II :: I I : I I : I : : I I I I I : I I I I : I I 

Db 1137 VEDIGHELTYVLPYEAAKEGAFVELFHEIDDRLSDLGISSYGISETTLEEIFLKVAEE— 1194 

Qy 1328 SLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEG 1387 

111:11 II: 
Db 1195 SGVDA-ETSDGTLP ARRNRR 1213 

Qy 1388 AGYTDVYGDYRPLF DNPQDPDNVSL — QEVEAEALS RV- GQ GS RKLDGGWLKVRQ 1439 

I : I I : I : II:: I : I : I I : I : I I : I I I : I 

Db 1214 A FGDKQSCLHPFTEDDAVDPNDSDLDPESRETDLLSGMDGKGSYQLKGWKLTQQQ 1268 

Qy 1440 FHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYH-NY 1498 

I I I I I I I I : I I : I I : I I I I I I : I : : I I I I I I I I I : I 
Db 1269 FVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALVFSLIVPPFGKYPNLELQPWMYNEQY 1328

Qy 1499 TQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLN 1558 

I : : : | I |:|:: III: 
Db 1329 T FVSNDAPE DMGTQELLNALTKDPGFGTRCM 1359 

Qy 1559 LSSGESRLLAARFFDSMCLESFTQGLPLSNEVPPPPSPAPSDSPASPDEDLQAWNVSLPP 1618 

:| I: I I 
Db 1360 EGNPIPN TPC 1369 

Qy 1619 TAG P EMWT SAP S L P RL VRE PVR CTCSAQGTGFS CPSSVGG-HPPQ 1662 

I I I I : I : I : : : : III: II II I I I 

Db 1370 LVGEEDWTTGP-VPQTLMDLFQNGNWTMKNPSPSCQCSSDKIKKMLPVCPPGAGGLPPPQ 1428 

Qy 1663 MRWTGDILTDITGHNVSEYLLFTSDRF RLHRYGAITFG 1701 

: I III ::ll I |:||: I : III : I 

Db 1429 RKQKTADILQNLTGRNNSDYLWTWQIIAKSLKNKVWWEFRYGGFSLGVSDSQALPPS 1488 

Qy 1702 — — NVLKSIP AS FGT RAP PMVRKI AVRRAAQVFYNN KGYH SMPT 1742 

I : I : : I I : : : : | : : | | | | : | : : : 

Db 1489 QEWNAIKQMKKLLKLTKDSSADRFLSSLGR FMTGLDTKN.NVKVWFNNKGWHAISS 1544 

Qy 1743 Y LN S LNN AI L RAN L P K S K GN P AAY G I T VT NH PMN KT S AS LS-LDYLLQGT DWI AI F 1 1 V 1801 

: I I : I I I I I I I I I I : I I : I I I I I I I : I I II: : I I : : : I : I 
Db 1545 FLNVI NNAI LRAN LQKGE-N P S Q YGI TAFNH P LN LT KQQLS EVALMTT S VDVLVS I CVI F 1603 



Qy 1802 AMSFVPAS FVVFLVAEKSTKAKHLQFVSGCNPIIYWLANYWDMLNYLVPATCCV 1861 

I I I I I I I I I I I I I : I : : I I I I I I I : I I : I I I I : I : I I I I I I : I I I I III 
Db 1604 AMSFVPASFWFLIQERVSKAKHLQFICGVKPVIYWLSNFVWDMCNYWPATLWIIFIC 1663 



Qy 1862 FDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATV 1921
I :l I II I : I II III III I: I I II II I:: II: III I Mill I i :|
Db 1664 FQQKS WSSTNLPVLALLLLLYGWSITPLMYPAS FVFKIPSTAYWLTSVNLFIGINGSV 1723

Qy 1922 ATFLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMK 1981
I I I : I : I I : : I : I I II I I I I I :: I I I I :: I I : : : : I : : :
Db 1724 ATFVLELFTNNK-FNDINDILKSVFLIFPHFCLGRGLIDMVKNQAMADALERFGE-NRFV 1781

Qy 1982 S P FEWDI VTRGLVAMAVEGVVGFLLT IMCQ YNFLRRPQRMPVSTKPVED- DVDVAS ERQR 2040
M I I : I I I I I II I I I I 11:1:: III ||: : I: I I II I I I I
Db 1782 SPLSWDLVGRNLF7\MAVEGVVFFLVTVLIQYRFFIRPRPVKARLPPLNDEDEDVRRERQR 1841

Qy 2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100
•I I ll:::|: I I |:|: : | I I I I : I : I : I I I I I I I I I I I I I I I I I II I I
Db 1842 I LEGGGQNDI LEI KELTKI YRRK RKPAVDRICVGI PPGECFGLLGVNGAGKTSTFKM 1898

Qy 2101 LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160
I I I I : I I : I : I : I : I : : I I : : I I I I I I I : : | | | | | j : : I I I :
Db 1899 LTGDTAVTRGDALLNKNSILSNIHEVHQNMGYCPQFDAITELLTGREHLEFFALLRGVPE 1958

Qy 2161 KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220
I : : I : I I : I I I II : I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I
Db 1959 KEVGKVGEWAI RKLGLVKYGEKYASNYSGGNKRKLSTAIALIGGPPWFLDEPTTGMDPK 2018

Qy 2221 ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280
I I I I I I I I : I I I I I I I I I I I I I I II | | | | | | : | | | | | | | 11111:1111111111
Db 2019 ARRFLWNCALSIIKEGRSWLTSHSMEECEALCTRMAIMVNGRFRCLGSVQHLKNRFGDG 2078

Qy 2281 YMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVS 2339
I I II I: :l I II II ::|||:| Mill I I I I : : I I : I
Db 2079 YTIVVRIAGSNPDLKPVQEFFGLAFPGSVLKEKHRNMLQYQLPSSLSSIlARIFSILSQSK 2138

Qy 2340 GVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368
I llllllllllll lllllll III:
Db 2139 KRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2167



RESULT 6 
O35600 

ID O35600 PRELIMINARY; PRT; 2310 AA. 

AC O35600; 

DT 01-JAN-1998 (TrEMBLrel. 05, Created) 

DT 01-JAN-1998 (TrEMBLrel. 05, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE ATP-binding cassette transporter. 

GN ABCA4 OR ABCR. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 
RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6; 



RX MEDLINE=97345663; PubMed=9202155; 

RA Azarian S.M., Travis G.H.; 

RT "The photoreceptor rim protein is an ABC transporter encoded by the 

RT gene for recessive Stargardt's disease (ABCR)."; 

RL FEBS Lett. 409:247-252(1997). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6; 

RA Azarian S.M., Travis G.H.; 

RL Submitted (JUN-1998) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF000149; AAC23916.1; -. 

DR MGD; MGI:109424; Abca4. 

DR GO; GO:0005548; F:phospholipid transporter activity; IMP. 

DR GO; GO:0006649; P:phospholipid transfer to membrane; IMP. 

DR GO; GO:0007601; P:vision; IMP. 

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter. 

DR InterPro; IPR005951; Rim_ABC_transpt. 

DR Pfam; PF00005; ABC_tran; 2. 

DR ProDom; PD000006; ABC_transporter; 2. 

DR SMART; SM00382; AAA; 2. 

DR TIGRFAMs; TIGR01257; rim_protein; 1. 

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1. 

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2. 

KW ATP-binding. 

SQ SEQUENCE 2310 AA; 260207 MW; 8370C6C8A62EF294 CRC64; 



Query Match 31.2%; Score 3951; DB 11; Length 2310; 

Best Local Similarity 36.4%; Pred. No. 9.7e-253; 

Matches 942; Conservative 396; Mismatches 771; Indels 480; Gaps 65; 
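
The "Best Local Similarity" figure reported above appears to equal the identical matches divided by all aligned columns (matches, conservative substitutions, mismatches and indels). That definition is an inference from the reported numbers, not something the report states, and the function names below are hypothetical. A minimal Python sketch of the check, under that assumption:

```python
import re

def parse_summary(line):
    """Parse a 'Matches N; Conservative N; ...' summary line into a dict of ints."""
    return {key: int(value) for key, value in re.findall(r"(\w+)\s+(\d+);", line)}

def local_similarity(counts):
    """Identical matches as a percentage of all aligned columns.

    Assumed definition: columns = matches + conservative + mismatches + indels.
    """
    columns = (counts["Matches"] + counts["Conservative"]
               + counts["Mismatches"] + counts["Indels"])
    return 100.0 * counts["Matches"] / columns

# Summary line copied from the RESULT 6 block above.
summary = "Matches 942; Conservative 396; Mismatches 771; Indels 480; Gaps 65;"
counts = parse_summary(summary)
similarity = round(local_similarity(counts), 1)
print(similarity)  # compare with the reported "Best Local Similarity 36.4%"
```

The same arithmetic can be run on the summary line of any other result block to confirm that the reported percentage and the column counts agree after OCR.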

Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60 

I I I I I : I I I I I I I I I : : |: II II :|: II I I I I I 

Db 1 MGFLRQIQLLLWKNWTLRKRQKIRFVVELVWPLSLFLVLIWLRNANPLYSQHECHFPNKA 60 

Qy 61 PLTSAGILPVMQSL C PDGQRDEFGFLQYANSTVTQLLERLDRWEEGNLF 110 

: I I I : I I : I : I I I I I : I I : I : I : 
Db 61 -MP S AGLL PWLQGI FCNMNN P C FQN PT P GES PGT VSN YNN S ILARVYRDFQELFMD 115 

Qy 111 DPARPSLG SELEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLT 167 

I I I . : I I I I : : I : I : I ' : : : : : I II 

Db 116 TPEVQHLGQVWAELRTLSQFMDTLR THPERFAGRGLQIRDILKDEEALTLFLM 168 

Qy 168 QNLSLPNSTAQALLAARVDPPEVTHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEE 227 

: I : I : I I I : : : I : : I I : : 

Db 169 RNIGLSDSVAHLLVNSQVRVEQFAY GVPDLELTD 202 

Qy 228 LLLAPALLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAEL 287 

: : I I I : : : I I : I I I I : I : I 

Db 203 IACSEALLQRF 1 1 F S Q RRGAQT VRDAL C P L S QVT LQWIEDTL 244 

Qy 288 RNQLDVAKVSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD-AQWLQDVDVT.SALALL 346 

: I I : I I I I I I I : I I I I I : I I : 

Db 245 YADVDFFKLFHVL PTLLDSSSQGINLRFWGGILSDLSPRMQKFIHRPSVQDLLWVS 300 

Qy 347 LPQGACTGRTPGPPASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAF 406 

I : I I I 



Db 301 RP LLQNGGP ETF 312 



Qy 407 VQLWAGLQ P I LC GNNRT I E P EALRRGNMS S LGFT S KEQRN LGL 449 

I I : I : I I I II I I II Ih 

Db 313 TQLMSILSDLLCG YPEG GGS RVFS FNWYEDNNYKAFLGI DSTRKDPAYS YDK 364 

Qy 450 LVHLMTSNP KI L YAP AGS EVDRVI LKANET FAFVGNV 486 

I : : I I I I I I : I I : : I I I I : I 

Db 365 RTTSFCNSLIQSLESNPLTKIAWRAAKPLLMGKILFTPDSPAARRIMKNANSTFEELDRV 424 

Qy 487 THYAQVWLNISAEIRSFLEQGRLQQHLR WLQQYVAELRLHPEA-LNL 532 

: I : : I I I : : I :: : : I : I I I I 

Db 425 RKLVKAWEEVGPQIWYFFEKSTQMTVI RDTLQHPTVKDFINRQLGEEGITTEAVLNFFSN 484 

Qy 533 SLDELPPALRQDNFSLPSGM7VLLQQLDTIDN7\ACGWIQFMSKVSVDIFKGFPDEE 587 

I:: :| I:: I |:: : :| |: : I I 

Db 485 GPQEKQADDMTS FDWRDI FNITDRFLRLAN QYLECLVLDKFESYDDEV 532 

Qy 588 S I VN YTLNQAYQDNVTVFAS VI FQTRKD — GSLPPHVHYKIRQNSSFTEKTNEIRRAYWR 645 

: I : : : : I I : I I I I E I I fill I I I I : I : II 

Db 533 QLTQRAL S LLEENR — FWAGWFPGMYPWAS S LP PHVKYKI RMDI DWEKTNKI KDRYWD 590 

Qy 646 PGPNTGGRFYFLY GFVWIQDMMERAIIDTFVGHDWEP — GSYVQMFPYPCYTRDDF 700 

II II | | : : | | | : | : | : : : I I I hi INI: II 

Db 591 SGPRADPVEDFRYIWGGFAYLQDMVEQGIVKSQM QAEPPIGVYLQQMPYPCFVDDSF 647 

Qy 701 LFVIEHMMPLCMVISWVYSVAMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQ 760 

: : : I : I I : : I : I II : I I : : I I I I I I I I I : I I : : I I I I I I : I 

Db 648 MI I LNRC FP I FlWIiAWI YS VSMTVKGI VLEKELRLKETLKNQGVSNAVI WCTWFLDS FS I 707 

Qy 761 LSISWALTAILKYGQVmHSHWIIWLFIAWAVATIMFCFLVSVLYSKAKLASACGGI 820 

: : : I : II : ■ : I : : I : I I : : I I I : I I I I I I I : I hill I I : I I I : 
Db 708 MALSIFLLTLFIMHGRILHYSDPFILFLFLLAFATATIMQSFLLSTLFSKASLAAACSGV 767 

Qy 821 IYFLSYVPYMYVAIREEVAHDKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHT 880 

III I : I : : : I : : I I I I I : I : I I I I : : I : I I : I : I I 

Db 768 IYFTLYLPHVLCFAWQ DRMTADLKTTVSLLSSVAFGFGTEYLVRFEEQGLGLQWSN 823 

Qy 881 FSQS PVEGDDFNLLLAVTMLMVDAVVYGI LTWYI EAVHPGMYGLPRPWYFPLQKS YWLGS 940 

: I I : I I I : I : I I : : I : : : I I : I I : I I I : : I II II I I I I I I I : I I I I I 
Db 824 IGKSPLEGDEFSFLLSMKMMLLDAALYGLLAWYLDQVFPGDYGTPLPWYFLLQESYWLGG 883 

Qy 941 GRTEAWEWSWPWARTPRLSVMEEDQACA-MESRRFEETRGMEEE PTH 986 

: I : II I : I : I I II 
Db 884 EGCSTREERALEKTEPLTEEMEDPEHPEGMNDSFFE 919 

Qy 987 --LP-LV— VCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGL 1041 

II II Ml I II:: : I I I I I :: I I I I I I I I I I I I : I I I I I I 

Db 920 RELPGLVPGVCVKNLVKVFEPSGRPAVDRLNITFYENQITAFLGHNGAGKTTTLSILTGL 979 

Qy 1042 FP PT S GS AT I YGHDI RTEMDEI RKNLGMCPQHNVLFDRLTVEEHLWFYS RLKSMAQEEI R 1101 

Mill: I I I I I : I : I :: II I II II I : II I II I I : I I : : I I : I I : 
Db 980 LPPTSGTVLIGGKDIETNLDWRQSLGMCPQHNILFHHLTVAEHILFYAQLKGRSWEEAQ 1039 

Qy 1102 REMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRA 1161 

II: I : I I I MM I I I I I I : I I I II I I I I I I I : : : I I I I I : I I I I I : I I : 
Db 1040 LEMEAMLEDTGLHHKRNEEAQDLSGGMQRKLSVAIAFVGDSKVWLDEPTSGVDPYSRRS 1099 



Qy 1162 IWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTL 1221 

I I I I : I I I : I I I I :: I I I I I I I I I I I I I I I I I I I hi I I : I I I I I : I I = III 

Db 1100 I WDLLLKYRSGRTI IMSTHHMDEADLLGDRI AI I SQGRLYCSGTPLFLKNCFGTGFYLTL 1159 

Qy 1222 VKR PAEPGGPQ EPGLASSPPGRAPLSSCSELQV SQFIRKHVA 1263 

I : : : : I I : | : : | | : : I I I : I I 

Db 1160 VRKMKNIQSQRGGCEGVCSCTSKGFSTRCPTR — VDEITEEQVLDGDVQELMDLVYHHVP 1217 

Qy 1264 SCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVS 1323 

II I I : : I I : : I : I : I I : I I : I I I I I I I : I I I I I : I I I I : 

Db 1218 EAKLVECIGQELIFLLPNKNFKQRAYASLFRELEETLADLGLSSFGISDTPLEEIFLKVT 1277 

Qy 1324 EEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSAR 1383 

I : : :: I : III : I : I II 

Db 1278 E DAGAG SM FVG GAQQKREQA — GLRHPCSAP — TEKLRQYAQAPHTCSPGQVD 1326 

Qy 1384 GDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGL 1443 

: I : I : I I I . I I : : I 

Db 1327 PPKGQPSPE PEDP GVP FNT GARL I LQHVQAL 1357 

Qy 1444 LVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQY-HNYT 1499 

MINI I : I : I I : I I I II : I : : : : I I I : I I I I I I II 

Db 1358 LVKRFHHT I RS RKDFVAQ I VLPAT FVFLALMLS 1 1 VP P FGEFPALTLH PWMYGHQYT FFS 1417 

Qy 1500 — QPRGNFIPYANEERREYRLRLS PDAS PQQLVSTFRLPSGVGATCVLKS PANGS LGPTL 1557 

: I III I I : I I 

Db 1418 MDEP NNEHLE- VLADVLLNRPG 1438 

Qy 1558 NLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLP 1617 

I : I I : : I I : I I II :: 

Db 1439 FGNRCLKE — EWLPEYPCINATSWKTPSVSP NIT 1470 

Qy 1618 P TAG P EMWT SAP S L P RL VRE P VRCT C S AQ GT G F SCPSSVGGHPPQMRW-TGDILTD. 1673 

: I I : I I III: II I I I I I : : : I I 

Db 1471 HLFQKQKWTAAHPSP SCKCSTREKLTMLPECPEGAGGLPPPQRTQRSTEVLQD 1523 

Qy 1674 I TGHNVS E YLL FT S DRFRLH — RYGAI T FGNVLKS I PAS 1710 

: I I : I : M : I : I :: I I I I : I I : I I I ' 

Db 1524 LTNRNISDYLVKTYP7VLIRSSLKSKFWVNEQRYGGISIGGKLPAIPISGEALVGFLSGLG 1583 

Qy 1711 - - FGT RAP PMVRK IAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLP 1757 

I : I : : : I : : I I I I : I : : : : I I : I I I I I I : I I 

Db 1584 Q1^VSGGPWREASKEMLDFLKHLETTDNIKVWFNNKGWHALVSFLNV7^NAILRASLP 1643 

Qy 1758 KSKGNPAAYGITVTNHPMNKTSASLS-LDYLLQGTDWIAIFIIVAMSFVPASFWFLVA 1816 

: : : I I I I I I : I : I I I I : I I I : I I : I I I I I I I I I I I : : I : 

Db 1644 RDR-DPEEYGITVISQPLNLTKEQLSDITVLTTSVDAWAI CVIFAMSFVPASFVLYLIQ 1702 

Qy 1817 EKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAV 1876 

I : I I I I I I I I : I I : I I I I I : : I I : : I I I I II I I I I I I I I I : 

Db 1703 ERVTKAKHLQFISGVSPTTYWLTNFLWDIMNYAVSAGLWGIFIGFQKKAYTSPDNLPAL 1762 

Qy 1877 LSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLK 1936 

: I I : I I I I : : I : I I I I I I - I I I I I : II I I I I I I I I : : I I : I : I I I : : : I 

Db 1763 VSLLMLYGWAVIPMMYPASFLFEVPSTAYVALSCANLFIGINSSAITFVLELFENNRTLL 1822 






Qy 1937 WNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAM 1996 

I : I : : : I I : : II I I : : : I : : : : I I : I : : : I I : I I : : : I II I 
Db 1823 RFNAMLRKLLIVFPHFCLGRGLIDLALSQAVTDVYAQFGE-EYSANPFQWDLIGKNLVAM 1881 

Qy 1997 AVEGWGFLLTIMCQYNF LRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMV 2051 

I : I I I I I I I I : : I : : I : I I II : : I I I I I I I I I : I I : : 

Db 1882 AIEGWYFLLTLLIQHHFFLTRWIAEPAREPV FDEDDDVAEERQRVMSGGNKTDIL 1937 

Qy 2052 KIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGWGAGKTSTFKMLTGDESTTGGE 2111 

I : I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I : I I I I I I I I : I I : 

Db 1938 KLNELTKVYSGSSSP AVDRLCVGVRPGECFGLLGVNGAGKTTTFKMLTGDTTVTSGD 1994 

Qy 2112 AFWGHSVLKELLQVQQSLGYCPQCDALFDELT7VREHLQLYTRLRGISWKDEARWKWAL 2171 

I I I I : I : I I :: I I I I I I I : I I I I I I I I I I I I I : I : : I I : 
Db 1995 ATVAGKSILTSISDVHQNMGYCPQFDAIDDLLTGREHLYLYARLRGVPSKEIEKVANWGI 2054 

Qy 2172 EKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILD 2231 

: I I : III: I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I : I I I III I : 
Db 2055 QSLGLSLYADRLAGTYSGGNKRKLSTAIALTGCPPLLLLDEPTTGMDPQARRMLWNTIVS 2114 

Qy 2232 LIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQ 2291 

: I : I I : I I I I I I I I I I I I I I I I I I I I I I I : I I I : I I I I I : I I I I I : : I : : II : 
Db 2115 IIREGRAWLTSHSMEECEALCTRLAIMVKGTFQCLGTIQHLKYKFGDGYIVTMKIKSPK 2174 

Qy 2292 SVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIED 2346 

: I : I I I I I : : : I I I I : : I : I : I I I I :: I : III: 
Db 2175 DDLLPDLNPVEQFFQGNFPGSVQRERHHSMLQFQVPSS — SLARIFQLLISHKDSLLIEE 2232 

Qy 2347 YSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVAD 2406 

I I I : I I I I I I I I I I I I : I : : : I I I : I : I : : 

Db 2233 YSVTQTTLDQVFVNFAKQQTETYDLP LHPRAAGASWQAKLEE 2274 

Qy 2407 EPEDLDTED 2415 

: I I:: 
Db 2275 KSGRLQTQE 2283 



RESULT 7 
O02698 

ID O02698 PRELIMINARY; PRT; 2281 AA. 

AC O02698; 

DT 01-JUL-1997 (TrEMBLrel. 04, Created) 

DT 01-JUL-1997 (TrEMBLrel. 04, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE ABC transporter. 

OS Bos taurus (Bovine) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea; 

OC Bovidae; Bovinae; Bos. 

OX NCBI_TaxID=9913; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Retinal rod cell; 

RX MEDLINE=97248596; PubMed=9092582; 

RA Illing M., Molday L.L., Molday R.S.; 

RT "The 220-kDa rim protein of retinal rod outer segments is a member of 

RT the ABC transporter superfamily."; 



RL J. Biol. Chem. 272:10303-10310(1997). 

DR EMBL; U90126; AAC48716.1; -. 

DR GO; GO:0005887; C:integral to plasma membrane; IEA. 

DR GO; GO:0016020; C:membrane; IEA. 

DR GO; GO:0005524; F:ATP binding; IEA. 

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA. 

DR GO; GO:0000166; F:nucleotide binding; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter. 

DR InterPro; IPR005951; Rim_ABC_transpt. 

DR Pfam; PF00005; ABC_tran; 2. 

DR ProDom; PD000006; ABC_transporter; 2. 

DR SMART; SM00382; AAA; 2. 

DR TIGRFAMs; TIGR01257; rim_protein; 1. 

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2. 

KW ATP-binding. 

SQ SEQUENCE 2281 AA; 257228 MW; 71CD404C98F7A079 CRC64; 



Query Match 30.7%; Score 3893.5; DB 6; Length 2281; 

Best Local Similarity 36.3%; Pred. No. 6.3e-249; 

Matches 937; Conservative 390; Mismatches 772; Indels 483; Gaps 70; 

Qy 1 MGFLHQLQLLLWKNVTLKRRS PWVLAFEI FI PLVLFFI LLGLRQKKPTI SVKEVPFYTAA 60 

III I : : I I I I I I I I : : I I : I I I I : I : I I I I I I I 

Db 1 MGFARQIKLLLWKNWTLRKRQKIRFWELWPLSLFLVLIWLRNVNPLYSKHECHFPNKA 60 

Qy 61 PLTSAGILPVMQSL-C PDGQRDEFG FLQYANSTVTQLLERLDRWEEGNLF 110 

: I I I : I I : I : I II I I I I : I , I : I : I : 
Db 61 -MPSAGMLPWLQGIFCNVNNPCFQSPTAGESPGIVSNYNNS ILARVYRDFQELLMD 115 

Qy 111 DPARPSLGSELEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNL 170 

I II II II I I : I : | :: : | | | : | : 

Db 116 APESQHLGQVWRELR TLSQLMNTLRMHPERIAGRGI RI REVLKDDEMLTLFLVKNI 171 

Qy 171 SLPNSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLL 230 

I :| I: ::| I I ::| I I :::: 

Db 172 GLSDSWYLLVNSQVRP EQFAR — GVPDLMLKDIAC 205 

Qy 231 APALLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQ 290 

: I I I I I I I I : I I I :: I I : I 

Db 206 SEALLE RFLIFP — QRRAAQTVRGSLCSLSQGT LQWMEDTLYAN 247 

Qy 291 LDVAKVSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD-AQKVT.QDVTVXSALALLLPQ 349 

: I I : I III I : I I : I : : I I : I 

Db 248 VDFFKLFHVF PRLLDSRSQGMNLRSWGRILSDMSPRIQEFIHRPSVQDLLWVTRP- 302 

Qy 350 GAC T GRT P G P PAS GAGGAAN GTGAGAVMG PNAT AEEGAP S AAALAT P DTLQ GQ C S AFVQ L 409 

: I I III 
Db 303 LVQTGGP ETFTQL 315 

Qy 410 WAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRN LGL 449 

I : I II II I III II: 

Db 316 MGILSDLLCG YPEG GGSRVFSFNWYEDNNYKAFLGIDSTRKDPIYSYDERTT 367 

Qy 450 LVHLMTSNP KI LYAPAGSEVDRVI LKANETFAFVGNVTHY 489 

I : : I I I I I I : I I : : I I I I : I 

Db 368 TFCNALIQSLESNPLTKIAWRAAKPLLMGKILFTPDSPATRRILKNANSTFEELERVRKL 427 



Qy 490 AQVWLNI SAEI RS FLEQGRLQQHLR WLQQYVAELRLHPEALNLSLDELP 538 

:| I : :| I :: :| | :| : | : I |: |: | 
Db 428 VKVWEEVGPQIWYFFDKSTQMSMIRDTLENPTVKAFWNRQ-LGEEGITAEAV LNFLY 483 

Qy 539 PALRQDNFSLPSGMALLQQLDTIDNAACGW IQFMSKVSVDI FKGFP 584 

I : 11:11 I I : : : : I I : : 

Db 484 NGPREG QADDVDN — FNWRDI FNITDRALRLANQYLECLI LDKFES YD 529 

Qy 585 DEESIVNYTLNQAYQDNVTVFASVI FQTRK — DGSLPPHVHYKIRQNSSFTEKTNEIRRA 642 

II : I : : : : I I : I I I I I I I I I I I : Ml I : I : 

Db 530 DEFQLTQRALSLLEENR— FWAGWFPDMHPWTSSLPPHVKYKIRMDIDWEKTNKIKDR 587 

Qy 643 YWRPGPNTGGRFYFLY GFVWIQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDD 699 

II II II I I : : I I I : I I : : | | |:| ||||: | 

Db 588 YWDSGPRADPVEDFRYIWGGFAYLQDMVEHGITRS-QAQEEVPVGI YLQQMPYPCFVDDS 646 

Qy 700 FLFVIEHMMPLCIWISWWSVAMTIQHIVAEKEHRLKEVMKTMGLNNAVHWAWFITG^ 759 

I: I : I I : : I : I I I : I I : : |-| I I I I I I I : I |::| I I IM I 

Db 647 FMIILNRCFPIF1WLAWIYSVSMTVKSIVLEKELRLKETLKNQGVSNRVIWCTWFLDSFS 706 

Qy 760 QLSISVTALTAILKYGQVLMHSHWIIWLFLAVYAVATIMFCFLVSVLYSKAKIASACGG 819 

• I : I : II : : | : : | : | : I : : I I I : : : I I I I I I I : I : ! : I I I : I I I 
Db 707 IMSMSICLLTIFIMHGRILHYSNPFILFLFLLAFSIATIMQCFLLSTFFSRASLAAACSG 766 

Qy 820 1 1 Y FLS YVP YMYVAI RE EV7VH D K I T AFEKC I AS IMS T TAFGLG S K Y FAL Y EVAGVGI QWH 879 

MM I : I : : : hill I I I : I I I I I : : I I : I I I I : I I 

Db 767 VIYFTLYLPHILCFAWQ DRITADMKMAVSLLSPVAFGFGTEYLAXFEEQGVGLQWS 822 

Qy 880 TFSQSPVEGDDFNLLLAWML1WDAVWGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLG 939 

11:111:1: I::: |:::|| :||:| II:: I II II I I I I I f I : I I I I I 
Db 823 NIGNSPMEGDEFSFLMSMKMMLLDAALYGLLAWYLDQVFPGDYGTPLPWYFLLQESYWLG 882 

Qy 940 SGRTEAWEWSWPWARTPRLSVMEEDQACA-MESRRFEETRGMEEE 983 

: I : II I : I : I I 

Db 883 G EGCSTREERALEKTEPITEEMEDPEYPEGINDCFF 918 

Qy 984 PTHLP-LV — VCVDKLTKVYKDDKKLALNKLSLNLYENQVVS FLGHNGAGKTTTMSILTG 1040 

MM III II::: : I : : : I : : II : I : : II II II M II I I : II : I I 
Db 919 ERELPGLVPGVCVKNLVKIFEPYGRPAVDRLNITFYESQITAFLGHNGAGKTTTLSIMTG 978 

Qy 1041 LFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEI 1100 

I I II II : : I II I : I I I : : II II I I II : M I I I I I : I I :: I I : : I 
Db 979 LLPPTSGTVLVGGKDIETNLDAI RQSLGMCPQHNILFHHLTVAEHILFYAQLKGRSWDEA 1038 

Qy 1101 RREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARR 1160 

: II: MM I MM : II I I : : I I I I I I I II I I : : : : I I I I M I I I I I : I I 
Db 1039 QLEME7^1LEDTGLHHKRNEEARDLSGGVQRKLSVAIAFVGDAKVWLDEPTSGVDPYSRR 1098 

Qy 1161 AIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLT 1220 

: I M I : I I I : I I I I : : I I I I I I I I I : I I I I I I I I I Ml I M M M I : I M II 
Db 1099 SIWDLLLKYRSGRTIIMSTHHMDEADILGDRIAIISQGRLYCSGTPLFLKNCFGTGFYLT 1158 

Qy 1221 LVKRP AEPGGPQEPGLASSPPG RAPLSSCSEL QVSQFIRKHV 1262 

I M I 1:1 : : I II MM : : : : I I 

Db 1159 L VRRMKT I Q S Q G RG REAT C S CAS KG FS VRC P — ACAEAI T P EQVL DG DVN E LT DMVH H H V 1216 



Qy 1263 ASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKV 1322 

II I I : : | I : : | : | : I I : I I : I I I | | | | : I I I I I : I I I I 

Db 1217 PEAKLVECIGQELIFLLPNKNFKQRAYASLFRELEETIADLGLSSFGISDTPLEEIFLKV 1276 

Qy 1323 SEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSA 1382 

:|: I :: |::: | || |: : |: | 

Db 1277 TEDLDSGHLFAGGTQQKRENI — NLRHPCSG PSEKAGQTPQGSSSH 1320 

Qy 1383 RGDEGAGYTDVYGDYRPLFDNPQDPDWSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHG 1442 

I : I II I II: I: I I : 

Db 1321 PGEPAA HPEGQPPP EREGHSRLNSGAR LIVQHVQA 1355 

Qy 1443 LLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQY-HNYTQP 1501 

I I I I I I I : I : | I : I | I | I : | : : : | : | I : I I I I I I I 

Db 1356 LLVKRFQHTIRSHKDFLAQIVLPATFVFLALMLSLIIPPFGEYPALTLHPWMYGQQYT — 1413 

Qy 1502 RGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSS 1561 

II : : I I :: I 
Db 1414 FFSMDQPDSEWLSAL ADVLVNKPG 1437 

Qy 1562 GESRLL7UVRFFDSMCL-ESFTQGLPLSNFVPPPPSPAPSDSPASPDE DLQAWNVSL 1616 

I : I I I : II I II III II 

Db 1438 FGNRCLKEEWLPEFPCGN SSPWKTPS VSPDVTHLLQQQKWTADQ 1481 

Qy 1617 PPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTGF SCPSSVGGHPPQMRW-TGDILT 1672 

I : I I III: II I I I I I : : : I I 

Db 1482 P SPS CRCSTREKLTMLPECPEGAGGLPPPQRIQRSTEILQ 1521 

Qy 1673 DITGHNVSEYLLFT SDRFRLH— RYGAITFGNVLKSIPAS 1710 

I : I I I I : : I : I : I : : I I I I : I I : I : 

Db 1522 DLTDRNVSDFLVKTYPALIRSSLKSKFWVNEQRYGGISVGGKLPAPPFTGEALVGFLSDL 1581 

Qy 1711 FGTRAPPMVRKIAVRRAA --QVFYNNKGYHSMPTYLNSLNNAILRANL 1756 

I I I : I I : | : : | | | | : | : : : : | | : I I I I I I : I 

Db 1582 GQLMNVSGGPMTREAAKEMPAFLKQLETEDNIKW 1641 

Qy 1757 PKS KGN PAAYGI TVTNH PMNKT S AS LS - LDYLLQGTDWI AI FI I VAMS FVPAS FWFLV 1815 

I I I I I I I I I : I : I I II: I I I : I I : I I I I I I I I I I I : : I : 

Db 1642 HKDK-NPEEYGITVISQPLNLTKEQLSEITVLTTSVDAWAI CVIFAMS FVPAS FVLYLI 1700 

Qy 1816 AEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPA 1875 

I : 1111111111:1 I I I I : : I I : : I I I I II I I I I I III 
Db 1701 QERVNKAKHLQFVSGVSPTTYWLTNFLWDIMNYTVSAALWGIFIGFQKKAYTSSENLPA 1760 

Qy 1876 VLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDL 1935 

: : : I : I I I I : : I : I I I I I I I : : I I : I I I I I I I I I I :: I I : I : I I I ::: I 
Db 1761 LVAL LML YGWAVI PMMY PAS FL F D I P S T AYVAL S CAN L F I G I N S S AI T FVL E L FENN RT L 1820 

Qy 1936 KWNS YLKSCFLI FPNYNLGHGLMEMAYNEYINEYYAKI GQFDKMKS PFEWDI VTRGLVA 1995 

: I : I : : I I I : : I I I I : : : I : : : : I I : I : : I I : I I : : : I I 

Db 1821 LRINAMLRKLLIIFPHFCLGRGLIDLALSQAVTDVYAQFGEAHS-SNPFQWDLIGKNLAA 1879 

Qy 1996 MAVEGVVG FLLT IMCQYN FL RRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDM 2050 

I II I I I I I I I I : : I I I I : I : : : : I I I I I I I I : : I I : 

Db 1880 MAVEGWYFLLTLLIQYQFFFSRWTTEPAKEPIT DEDDDVAEERQRI I S GGNKTDI 1935 



Qy 2051 VKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGG 2110 

: : : I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I : I I I I I I I I : I I 

Db 1936 LRLNELTKVYSGTSSP AVDRLCVGVRPGEC FGLLGVNGAGKTTT FKMLTGDTAVT S G 1992 

Qy 2111 EAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWA 2170 

: I I I I : I : I I I : I I I I I I I : I I I I I I I I I I I I I : : : II I : 
Db 1993 DATVAGKSILTNISDVHQSMGYCPQFDAIDDLLTGREHLYLYARLRGVP7VEEIERVTNWS 2052 

Qy 2171 LEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLIL 2230 

: : I I : III: I I I I I I I I I I I I I I I I I I II I : I I I I I I I I I I : I I I III I : 
Db 2053 IQSLGLSLYADRLAGTYSGGNKRKLSTAIALIGCPPLVLLDEPTTGMDPQARRMLWNTIM 2112 

Qy 2231 DLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSS 2290 

: I : I : I I I I I I I I I I I I I I I II I I II I I : I I I : I I I I I : : I I I II : : I : : : I 
Db 2113 GIIREERAWLTSHSMEECEALCTRLAIMVKGAFQCLGTIQHLKSKFGDGYIVTMKIRSP 2172 

Qy 2291 Q SVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIE 2345 

: : I : I I I I I : : : I I I : : I : I : I I I I : : I : III 

Db 2173 KDDLLPDLGPVEQFFQGN-FPGSVQRERHYNTLQFQVSSS— SLARIFRLLVSHKDSLLIE 2230 

Qy 2346 DYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGCLLSLLRPRSAPTELRALVA 2405 

: I I I : I I I I I I I I I I I I : I : : : I I I : I : I 

Db 2231 EYSVTQTTLDQVFVNFAKQQNETYDLP LH P RTAGAS RQAKEV 2272 

Qy 2406 DE 2407 

I : 

Db 2273 DK 2274 

RESULT 8 
Q7TNJ2 



ID Q7TNJ2 PRELIMINARY; PRT; 2170 AA. 

AC Q7TNJ2; 

DT 01-OCT-2003 (TrEMBLrel. 25, Created) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE ATP-binding cassette transporter sub-family A member 7. 

GN ABCA7. 

OS Rattus norvegicus (Rat). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Wister; TISSUE=Platelet; 

RA Sasaki M., Nada S., Yamaguchi A.; 

RT "Cloning of rat ABCA7."; 

RL Submitted (DEC-2002) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AB097814; BAC81426.1; -. 

KW ATP-binding. 

SQ SEQUENCE 2170 AA; 237718 MW; 003C8DF70B8744CE CRC64; 



Query Match 29.1%; Score 3681; DB 11; Length 2170; 

Best Local Similarity 35.8%; Pred. No. 7.8e-235; 

Matches 911; Conservative 369; Mismatches 743; Indels 520; Gaps 71; 



Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60 

II 111111111:111 I I : I I I I I I I :: I I : I I 

Db 1 MAFCTQLMLLLWKNYTYRRRQPIQLWELLWPLFLFFILVAVRHSHPPLEHHECHF-PNK 59 



61 PLTSAGILPVMQSLCPDGQRDEF GFL-QYANSTVTQLLERLDRWEEGNLF 110 

I I I I I : I : I I : I II : : | : : : I I I : I : 

60 PLPSAGTVPWLQGLVCNVNNSCFQHPTPGEKPGVLSNFKDSLISRLLADAHTVL-GGHST 118 

111 DPARPSLGSELEALRQHLEALSAG— PGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQ 168 

: I I : I I I : : I I I : .: I : II : I 

119 QDMLAALGKLI PVLR AVGSGAWPQESNQPAKQGSVT ELLEKILQ 162 

169 NLSLPNSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEEL 228 

II II I : | : : : | | 

163 RAS LETVLGQA QDSMRKFSDATRTVA QEL 191 

229 LLAPALLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELR 288 

I I : I : I I : I I I : I : : I : I I 
192 LTLPSLV ELRALLRRPRGSAGSLELISEALCS 223 

289 NQLDVAKVSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLP 34 8 

224 223 

349 Q G AC T G RT PGP PAS GAG GAAN GT GAG AV MGPNATAEEGAPSAAALATPD-TLQGQCS 404 

I I I : I I : I I : III III: I I : I II 

224 TKGPSSPG-GLSLNWYEANQINEFMGP ELAPT LPDSSLSPACS 265 

405 AFV QLWAGLQP I LCGNNRT I E PEALRRGNMS S LGFTS KEQRNLGLLVHLM 454 

II | | | : | : : | 
266 EFVGALDDHPVSRLLWRRLKPLILG 290 

455 T SN P KI L YAP AG S EVDRVI LKAN ET FAFVGN VTH YAQVWLN I S AE I RS FLEQ GRLQ 510 

I I I : I I : : : : : | : | | : : : : | : : | : | : | | 

291 KILFAPDTNFTRKLMAQWQTFEELALLRDLHELWGVLGPQIFNFMNDSTNVAMLQ 346 

511 QHL RWLQQYVAELRLHPEALNLSLDELPPALRQDNF-SLPSGMALLQQLDTIDNA 564 

: I III : : I I : I I I : : I : : I I : : 
347 KLLDVEGTGW-QQQTPKGQKQLEAIR D FL D P S RG R YNWQ EAHADMGRLAE I 396 

565 ACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFASVI FQTRK DGSLPP- 620 

I I : I I : I : I I I : : I : I : : : I : : I : : I I I I 
397 -LG — QILECVSLDKLEAVPSEEALVSRALELLGERR — LWAGIVFLSPEHPLDSSEPPS 451 

621 HVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLY GFVWI QDMMERA 670 

I : III: : I I : I I : I I I I : I I I I : : I I : : I : I 

452 PTTTGPGHLRVKIRMDIDDVTRTNKIRDKFWDPGPSADPLMDLRYVWGGFVYLQDLLEQA 511 

671 IIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVI SWVYSVAMTIQHIVAE 730 

: I I I I : I I : I I I I II h : I I : : : I : I I I I : I : : : I I 

512- AVRVLSGRD-SRAGLYLQQMPHPCYVDDVFLRVLSRSLPLFLTLAWIYSVALTVKAWRE 570 

731 KEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSIS 790 

I I I I : I I : I I I : I I I : I I :: :| I : I I I : | : | | | : : : | | | 

571 KETRLRETMRAMGLSRAVLWLGWFLSCLGPFLVSAALLVLVLKLGNILPYSHPVWFLFL 630 

791 AVYAVATIMFCFLVSVLYSK7VKLASACGGIIYFLSYVPY-MYVAIREEVAHDKITAFEKC 849 
I : I I I I : I I : I : I : I I I : I I I I : I I I : I I : I I I I : : | 



631 AAFAVAT VAQ S FLL S AFF S RAN LAAAC GG LAY FAL YL P YVL C VAWRE RL P L GGL LA 68 6 



850 IASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAWYGI 909 

I I : I INI: III I I I I I I I I hi I : : I I I : I I : 

687 -VSLLSPVAFGFGCESLALLEEQGDGAQWHNLGTGPAE-DVFSLAQVSAFLLLDAVIYGL 744 

910 LTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQAC7\M 969 

I I : I I I I I I I : I I I I I :: I I I I I I : II: II 

745 ALWYLEAVCPGQYGIPEPWNFPFRRSYWCGPG PPKSSVL APAP 787 

970 ESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQVVS FLGHNGA 1029 

: : : I I I I I : I I : : : I I I : I : I I : : I II I I I I 

788 QDPKVL VEEPPPGLVPGVSIRGLKKHFRGSPQPALRGLNLDFYEGHITAFLGHNGA 843 

1030 GKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFY 1089 
I I I I I : I I I : II I I I : I I I I : I |||::| I II : I I : I I I : I I I I I lllllhlll 
84 4 GKTTTLSILSGLFPPSSGSASILGHDVQTNMAAIRPHLGICPQYNVLFDMLTVEEHVWFY 903 

1090 SRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDE 1149 
Ml:: I I : : | | : | I I : : I I I I I : I I I I I I I I I I I I I I :|:|| 
904 GRLKGVSAAAIDSEQEHLIRDVGLIPKRDTQTRHLSGGMQRKLSVAIAFVGGSRWIMDE 963 

1150 PTAGVDP YARRAI WDLI LKYKPGRTI LLSTHHMDEADLLGDRIAI I SHGKLKCCGS PLFL 1209 
M I I II I : II I I : I : I II : I I I : : II M I : II' I : II I II : I : : : I I I I I I I I II 
964 PTAGVDPASRRGIWELLLKYREGRTLILSTHHLDEAELLGDRVAMVASGSLCCCGSPLFL 1023 

1210 KGTYGDGYRLTLVK RPAEPG GPQEPGLAS 1238 

: I I I I II II I : I II 

102 4 RRHLGCGYYLTLVKSSQSLVTHDLKGDTEDPRREKKSGSEGKTADTVLTRDGPHRSSQVP 1083 

1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 12 98 

M I : : : I : : : : : I I I I : I | | | | Ml Mil:: 

1084 APDA-VPVTPSAAL-ILELVQRHVPGAQLVEELPHELVLALPYAGALDGSFATVFQELDQ 1141 

1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358 

I : I I : M : I I I I I : I II I I I : : I : : : I | I I 
1142 QLERLGLTGYGISDTNLEEIFLKWEEAHA — HGEGGDPRQQQHLLTATPQP HTG 1194 

1359 NIARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAE 1418 

II I : | : || | : : : 
1195 PEASVLE NGELAKLVLDPQAPKGSAPTTAQVQ 1226 

1419 ALSRVGQGSRKLDGGW-LKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVAL 1477 

III : I I I I I I I I I : : I I : I I : I II II : I : I 
1227 GWTLTCQQLR7VLLHKRFLLARRSRRGLFAQIVLPALFVGLALFFTL 1272 

1478 SVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPS 1537 

II I III III: I | : : : | : I I : : : I 

1273 IVPPFGQYPPLQLSPAMY GPQVSFFSED APADPNRM KLLE 1312 

1538 GVGATCVLKSPA NGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPP 1594 

: I : I : III I III I I : I I 

1313 ALLGEAGLQDPSVQGKGSRGSECTHS LACYFTVPEVPPDVASILASGNWTPDSP 1366 

1595 SPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTGF S 1651 

Ml I M I 

1367 SPA CQCSQPGARRLLPD 1383 



Qy 1652 CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLH RYGAIT 1699 

II: II IN I : : : : : | | | | | : : | : | | | | : 

Db 1384 C P AGAGG P P P PQAMAG FGE WQN LT GRNVS DFL VKT Y P S LVRRGLKT KKWVDEVRYGGFS 1443 

Qy 1700 FGNVLKS I PAS FGTRAPPMVRKI A VRRAAQV 1730 

I : I : : I I : I I : : : 

Db 1444 LGGRDPDLPS GREWRTVAEMRALLSPQPGNTLDRILNNLTQWALGLDARNSLKI 1498 

Qy 1731 FYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLL-Q 1789 

: : I I I I : I : I :: I I I : I I I I I I I : I I I I I : I I II I : 

Db 1499 WFNNKGWHAMVAFVNRANNGLLRAFLP-SGSVRHAHSITTLNHPLNLTKEQLSEATLIAS 1557 

Qy 1790 GTDWIAIFIIVAMSFVPAS FVVFLVAEKSTKAKHLQFVSGCNPII YWLANYVWDMLNYL 1849 

I I : : : I : : I I I I I I I I I : I : I : I : I I I I I III : I I I I : : I I I II I 

Db 1558 SVDVLVS I CWFAMS FVPAS FTLVLI EERI TRAKHLQLVS GLPQTL YWLGNFLWDMCN YL 1617 

Qy 1850 VPATCC VI I L FV- FDL PAYT S PTN FP AVL S L FLL YGWS I T P IMY PAS FW FEVP S S AYVFL 1908 

I I I I : : : I : I II : I I I I : I I I I I I I I I I I : I I I I I I : I I I I : I I I I 

Db 1618 V-AVCIWLIFLAFQQKAYVAPENLPALLLLLLLYGWSITPLMYPASFFFSVPSTAYWL 1676 

Qy 1909 IVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYIN 1968 

I I I I I I I : : : I I I : I : I I : : I : I : II I I I I I : : I I I I : : I I : : 

Db 1677 TCINLFIGINSSMATFVLELLS-DQNLQEVSRILKQVFLIFPHFCLGRGLIDMVRNQAMA 1735 

Qy 1969 EYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPV 2028 

: : : : I : : I I ' I I I : : I : I I : I : I : I : : I : II II I 

Db 1736 DAFERLGD-KQFQSPLRWDIIGKNLLAMVAQGPLFLLITLLLQHRNRLLPQ--PKSRLPP 1792 

Qy 2029 — EDDVDVAS ERQRVLRGDADNDMVKI ENLTKVYKS RKI GRI LAVDRLCLGVRPGEC FG 2085 

1:111 I I : I I : I I : : : : I I II I : : I I I I I I I I : I I I I I I 

Db 1793 PLGEEDEDWRERERVTKGATQGDVLVLRDLTKVYRGQ RSPAVDHLCLGIPPGECFG 1849 

Qy 2086 LLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTA 2145 

I I I I I I I I I I I I I : I : I I I : III : I I : I : I : I : I I I I I I I : I I II 

Db 1850 LLGVNGAGKTSTFRMVTGDTLPSSGEAVLAGHNVAQEPSAAHRSMGYCPQSDAIFDLLTG 1909 

Qy 2146 REHLQLYTRLRGI SWKDEARVVKWALE KLELTKYADKPAGTYSGGNKRKLSTAIALI 2202 

I I I I : I : I M I : I I : I : I I : I I I I I : I I I I I I I I I I I I I : I I : I I : 

Db 1910 REHLELFARLRGV P EAQVAQTAL S GLVRLGL P S YADR P AGT Y S G GN KRKLATALALV 1966 

Qy 2203 GYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNG 2262 

I M : I I I I I I I I I I I I I I I I : I ::: I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1967 GDPAWFLDEPTTGMDPSARRFLWNNLLSWREGRSWLTSHSMEECEALCTRLAIMVNG 2026 

Qy 2263 RLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQL- 2321 

I I I I I I I I I I : I I I I : : I : I I : : I I I : I I : I I : : : : : I I 

Db 2027 RFRCLGSAQHLKSRFGAGHTLTLRVPPDQP-EPAIAFIVTTFPDAELREVHGSRLRFQLP 2085 

Qy 2322 KSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDNLE--QQETE 2375 

: I I : I I : : I : I I : I I I I I I I : I i : I : I I : I I I I I 

Db 2086 PGGGCTLARVFRELAAQGKAHGVEDFSVSQTTLEEVFLYFSKDQGEEEEGSGQETETREV 2145 

Qy 2376 PPSALQSPLGCLLSLLRPRSAPT 2398 

III III 

Db 2146 STPGLQHPKRVSRFLEDPSSVET 2168 



RESULT 9 
Q91V24 

ID Q91V24 PRELIMINARY; PRT; 2159 AA. 

AC Q91V24; 

DT 01-DEC-2001 (TrEMBLrel. 19, Created) 

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE ATP-binding cassette transporter sub-family A member 7. 

GN ABCA7. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=01129, and DBA/2; 

RX MEDLINE=21328888; PubMed=11435699; 

RA Broccardo C., Osorio J., Luciani M.-F., Schriml L.M., Prades C.,

RA Shulenin S., Arnould I., Naudin L., Lafargue C., Rosier M., Jordan B.,

RA Mattei M.G., Dean M., Denefle P., Chimini G.;

RT "Comparative analysis of the promoter structure and genomic 

RT organization of the human and mouse ABCA7 gene encoding a novel ABCA

RT transporter."; 

RL Cytogenet. Cell Genet. 92:264-270(2001). 

DR EMBL; AF287142; AAK56863.1; -. 

DR EMBL; AF287141; AAK56862.1; -.

DR MGD; MGI:1351646; Abca7.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005524; F:ATP binding; IEA.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA.

DR GO; GO:0000166; F:nucleotide binding; IEA.

DR GO; GO:0004601; F:peroxidase activity; IEA.

DR GO; GO:0006979; P:response to oxidative stress; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter.

DR InterPro; IPR002016; Peroxidase. 

DR Pfam; PF00005; ABC_tran; 2. 

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

DR PROSITE; PS00435; PEROXIDASE_1; 1.

KW ATP-binding. 

SQ SEQUENCE 2159 AA; 236882 MW; CD2BE3FE0D8B822B CRC64; 

Query Match 29.0%; Score 3675; DB 11; Length 2159; 

Best Local Similarity 35.4%; Pred. No. 1.9e-234; 

Matches 904; Conservative 371; Mismatches 733; Indels 544; Gaps 68; 

Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65 

I I I I I I I I I : I I I I I : I I I I I I I :: I I : I I I I I I 

Db 6 QLMLLLWKNYTYRRRQPIQLLVELLWPLFLFFILVAVRHSHPPLEHHECHF-PNKPLPSA 64 

Qy 66 GILPVMQSLCPDGQRDEF GFL-QYANSTVTQLLERLDRWEEGNLFDPARP 115

I : I : I I : I I I : : I : : : I I III: 



65 GTVPWLQGLVCNVNNSCFQHPTPGEKPGVLSNFKDSLISRLLAD-TRTVLGGHSIQDMLD 123 



116 SLGSELEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNS 175 

: I I : I I : I I I : I III:::: 
124 ALGKLIPVLR AVGGGARPQESDQPT SQGSVTKLLEKI 160 

176 TAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALL 235 

I I : I I : I : I : | : | : | | | | : | : 

161 LQRASLDP VLG QAQDSMRKFSDAIRDLA QELLTLPSLM 198 

236 EQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAK 295

II : I I 1:1: : I : I I 
199 ELRALLRRPRGSAGSLELVSEALCS 223 

296 VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGR 355 

224 223 

356 T P G P PAS GAG GAAN GT GAG AV MG PNAT AE E GAP S AAALAT P D- T LQGQ C S AFV 407 

I I I : I I : I | : | | | III I I I : I I I I I 

224 TKGPSSPG-GLSLNWYEANQLNEFMGP EVAP ALPDNSLSPACSEFVGTLD 272 

408 QLWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKIL 461

I I I : I :: I III 
273 DHPVSRLLWRRLKPLILG KIL 293 

462 YAPAGSEVDRVILKANETFAFVGNVTHYAQVWLNISAEIRSFLEQ 506 

: I I : : : : : | : | | : : : : | : : | : | : 

294 FAPDTNFTRKLMAQWQTFEEL7VLLRDLHELWGVLGPQIFNFMNDSTNVAMLQRLLDVGG 353 

507 -GRLQQHLRWLQQYVAELRLHPEALNLSLDELPPALRQDNFSLPSGMALLQQLDTIDNAA 565 

I : I I I I : : | | | : I I I : : : I I : : I I 

354 TGQRQQTPR AQKKL — EAIK DFLDPS — RGGYSWREAHADMGRLAGILG-- 398 

566 CGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFASVIF QTRKD 615 

I I I I : I : I I I :: I : I : ::| ::| : 

399 QMMECVSLDKLEAVPSEEALVSRALELLGERR— LWAGIVFLSPEHPLDPSELSSP 452 

616 GSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLY GFVWIQDMMERAII 672 

I I : : I I I : : I I : I I : I I I I : I I I I : : I I : : I : I : 

453 AliSPGHLRFKIRMDIDDWRTNKIRDKFWDPGPSADPFMDLRYWGGFVYLQDLLEQAAV 512 

673 DTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVI SWVYSVAMTIQHIVAEKE 732 

I : I I : I hill MM: : I I : : : I : I I I I : I : : : I I I I 

513 RVLGGGN-SRTGLYLQQMPHPCWDDVFLRVLSRSLPLFLTLAWIYSV7VLTVKAWREKE 571 

733 H RLKEVMKTMGLNNAVHWVAWFI T GFVQL S I SVT ALTAI LKYGQVLMH S HWI I WL FLAV 792 

I I : I I : I I I : I I I : I I : : : I I : I II : I : I I I : I : I I I I 

572 TRLRETMR7\MGLSRAVLWLGWFLSCLGPFLVSAA1jLVLVLKLGNILPYSHPVVI FLFLAA 631 

793 YAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPY-MYVAIREEVAHDKITAFEKCIA 851

: I I I I : I I : I : I : II I : I I I I : I I I : I I : I I I I : : I I 
632 FAVAT VAQ S FL L S AFFS RAN LAAAC GGLAY FAL YL P YVLC VAWRE RLH LGGL LA A 686 

852 SLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILT 911

I I : I MM: M | Mill II M : I I : : I I I : I I : 

687 SLLSPVAFGFGCESLALLEEQGDGAQWHNLGTGPAE-DVFSLAQVSAFLLLDAVIYGLAL 745 



912 WYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQACAMES 971 

I I : I II I I I I : I I I I I :: I I I I I I : I I : I I : 

746 WYLEAVCPGQYGI PEPWNFPFRRS YWCGPG PPKSSVL APAPQD 788 

972 RRFEETRGMEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGK 1031 

: : I I I I I : I I : : : I I I : I : I I : : I I I I I I I I I 

789 PKVL VEEPPLGLVPGVSIRGLKKHFRGCPQPALQGLNLDFYEGHITAFLGHNGAGK 844 

1032 TTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSR 1091 

I I I : I I I : I I I I I : I I I I : I Ml:: I | | | : I I : I I I : Ml I I I Mill: Ml I 

845 TTTLSILSGLFPPSSGSASILGHDVQTNMAAIRPHLGICPQYNVLFDMLTVEEHVWFYGR 904 

1092 LKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPT 1151 

II : : : I ' : : : I I : I : II : : I I I I I : I I I I I I I I M I I I I : I : M I I 

905 LKGVSAAAMGPERERLIRDVGLTLKRDTQTRHLSGGMQRKLSVAIAFVGGSRWIMDEPT 964 

1152 AGVDP YARRAIWDLI LKYKPGRTI LLSTHHMDEADLLGDRIAI I SHGKLKCCGS PLFLKG 1211 
I I I I I : II I I : I : I II : I I I : : I II I I : I I I : I I II I : I : : : I I I I I II I I I : 
965 AGVDPASRRGIWELLLKYREGRTLILSTHHLDEAELLGDRYAMVAGGSLCCCGSPLFLRR 1024 

1212 TYGDGYRLTLVKRPAE PGGPQEP GLASS PP 1241 

I I I I I I I I I : : I II I 

1025 HLGCGYYLTLVKSSQSLVTHDAKGDSEDPRREKKSDGNGRTSDTAFTRGTSDKSNQAPAP 1084 

1242 GRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLD 1301 

I I : : I :::::: II III II : I I I I :. I : I I I : : I : 

1085 GAVP I T P - S TARI L E LVQQHVP GAQ LVE DL P H EL LLVL P YAGAL DGS FAMVFQ E L DQQ LE 1143 

1302 ALHLS S FGLMDTTLEEVFLKVS EEDQS LENS EADVKES RKDVLPGAEGPASGEGHAGNLA 1361 

I I : : I : I I I I I : I I I I I : III 
1144 LLGLTGYGI SDTNLEEI FLKWED AHREG 1172 

1362 RCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLFD NPQDPDNVSLQEVEA 1417 

I I II II I : I 

1173 GDSRPQLHLRTCTPQPPTGPEASVLEN 1199 

1418 EALSRVG— QGSRKLD'GGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTV 1475 

I: I : I I :| II III III: : l|:|::||| MM: 

1200 GEIAPQGIAPNAAQVQGWTLTCQQLRALLHKRFLLARRSRRGLFAQWLPALFVGIALFF 1259

1476 ALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEERREYRLRLSP-DASPQQLVSTFR 1534 

MM I I I I I I I : I I : : : I : Ml: : | : 

1260 SLIVPPFGQYPPLQLSPAMY GPQVSFFSED APGDPNRMKLLEALL 1304 

1535 LPSGVGATCVLKS PANGS LGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPP 1594 

: I : : II I : I I I I I : I I 

1305 GEAGLQEPSMQDKDARG SECTHSLACYFTVPEVPPDVASILASGNWTPESP 1355 

1595 S PAP S DS PAS PDEDLQAWNVS LP PTAGPEMWT S APS LPRLVREPVRCTCS AQGTGF S 1651 

Ml I I I I 

1356 SPA CQCSQPGARRLLPD 1372 

1652 CPSSVGGHPPQMRW-TGDILTDITGHNVSEYLLFTSDRFRLH RYGAIT 1699 

II: I I I I I - I : : : : : I I I I I :: I : I III : 

1373 CPAGAGGPPPPQAVAGLGEVVQNLTGRNVSDFLVKTYPSLVRRGLKTKKWVDEVRYGGFS 1432 




1700 FGNVLKS I PAS FGTRAP PMVRKI A VRRAAQV 1730 

I : I : I I : I I : : : 

1433 LGGRDPDLPTGH EWRTLAEI RALLS PQPGNALDRI LNNLTQWALGLDARNSLKI 1487 

1731 FYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPA — AYGITVTNHPMNKTSASLSLDYLL 1788 

::IMI:|:| ::| II :| I II I |: II I I I : I I II h 

1488 WFNNKGWHAMVAFVNRANNGLLHALLP SGPVRHAHSITTLNHPLNLTKEQLSEATLI 1544 

1789 -QGTDWI AI FI I VAMS FVPAS FWFLVAEKSTKAKHLQFVSGCNPI I YWLANYVWDMLN 1847 

I I : : : I : : I I I I I I I I I : I : I : I : I I I I I III : I I I I : : I I I I 
1545 ASSVDVLVSICWFAMSFVPAS FTLVLIEERITRAKHLQLVSGLPQTLYWLGNFLWDMCN 1604 

1848 Y L VP AT C C VI I L FV- FD L PAY T S P T N F P AVL S L FL L Y GW S I T P I MY PAS FW F EVP S S AYV 1906 

III I I I : : I : I II : I I . I I : I I I I I I I I I I I : I I I I I I :'l I I I : I I I 
1605 YL V- AVC I WF I FLAFQQ RAYVAP EN L PAL L L L L L L YGW S I T P LMY PAS F FF S VP S T AYV 1663 

1907 FLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEY 1966 

I I I I I I I I : : : I I I : I : I I : : I : I : II I I I I I : : I I I I : : I I : 
1664 VLTCINLFIGINSSMATFVLELLS-DQNLQEVSRILKQVFLIFPHFCLGRGLIDMVRNQA 1722 

1967 INEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTK 2026 

: : : : : I : : I I ' I I I : : I : I I : I : I : I : : I : II I 

1723 MADAFERLGD-KQFQSPLRWDIIGKNLLAiyn^AQGPLFLLITLLLQHRNRLLPQSKPRLLP 1781 

2027 P V- E D DVD VAS E RQ RVL RG D ADN DMVK I EN LT KVY KSRKIGRI LAVD RL C L GVR P G EC FG 2085 

I : I : I I I I I I : I I : I I : : : : I I I I I : : I I I I I I I I I : I I I I I I 
1782 P LGEE DE DVAQ E RE RVT KGAT Q GDVLVL RD LT KVY RGQ RNPAVDRLCLGIPPGECFG 1838 

2086 LLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTA 2145 

I I I I I I I I I I I I I : I : I I I : III : I I : I : I : I : I I I I I I I : I I II 

1839 LLGVNGAGKTSTFRMVTGDTLPSSGEAVLAGHNVAQERSAAHRSMGYCPQSDAIFDLLTG 1898 

2146 REHLQLYTRLRGI SWKDEARWKWALE KLELTKYADKPAGTYSGGNKRKLSTAIALI 2202

I I I I : I : I I I I : I I : I : I I : I I I I I : I I I I I I I I I I I I I : I I : I I : 

1899 REHLELFARLRGV P EAQVAQT AL S GLVRL GL P S YAD RP AGT Y S GGN KRKLAT ALALV 1955 

2203 GYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNG 2262 

I II : II I I I I I II I I I I I I I I I : I :: : I I I I I I I I I I II I I II I II I I I I I II I 
1956 GDPAWFLDEPTTGMDPSARRFLWNSLLSVWEGRSWLTSHSMEECEALCTRLAIMVNG 2015

2263 RLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQL- 2321 

I I I I I I I I I I I I I I : : I : I I : : I I I I I : I I : : : : : I I 

2016 RFRCLGSSQHLKGRFGAGHTLTLRVPPDQP-EPAIAFIRITFPGAELREVHGSRLRFQLP 2074 

2322 KSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQ 2381 

: I : I I : : I : I I : I I I I | I I : I I : I : I I : I | 

2075 PGGRCTLTRVFRELAAQGRAHGVEDFSVSQTTLEEVFLYFSKDQGEEEESSRQEAEEEEV 2134 

2382 SPLGCLLSLLRPRSAPTELRALVADEPEDLDT 2413 

II II: : I I :: I 
2135 SKPG RQHPKRVSRFLED-PSSVET 2157 



RESULT 10 
Q9BZC4 

ID Q9BZC4 PRELIMINARY; PRT; 2146 AA. 

AC Q9BZC4; 



DT 01-JUN-2001 (TrEMBLrel. 17, Created)

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE ABC transporter member 7. 

GN ABCA7.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21328888; PubMed=11435699;

RA Broccardo C., Osorio J., Luciani M.-F., Schriml L.M., Prades C.,

RA Shulenin S., Arnould I., Naudin L., Lafargue C., Rosier M., Jordan B.,

RA Mattei M.G., Dean M., Denefle P., Chimini G.;

RT "Comparative analysis of the promoter structure and genomic 

RT organization of the human and mouse ABCA7 gene encoding a novel ABCA

RT transporter.";

RL Cytogenet. Cell Genet. 92:264-270(2001). 

DR EMBL; AF328787; AAK00959.1; -.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005524; F:ATP binding; IEA.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA.

DR GO; GO:0000166; F:nucleotide binding; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2. 

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

KW ATP-binding. 

SQ SEQUENCE 2146 AA; 234306 MW; 2391728D5AD97E75 CRC64; 

Query Match 28.6%; Score 3620; DB 4; Length 2146; 

Best Local Similarity 35.8%; Pred. No. 8.8e-231; 

Matches 899; Conservative 363; Mismatches 772; Indels 478; Gaps 60;

Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60

II I I I I I II I : I I I I I : I I I I I I I :: 1 I : I I 
Db 1 MAFWTQLMLLLWKNFMYRRRQPVQLLVELLWPLFLFFILVAVRHSHPPLEHHECHF-PNK 59 

Qy 61 PLTSAGILPVMQSLCPDGQR DEFGFLQYANSTVTQLLERLDRWEEGNLFD 111

:| :| I : : I I I I :: I III 

Db 60 PLPSAGTVPWLQGLICNVNNTCFPQLTPGEEPGRLSNFNDSLVSRLLADARTVLGGASAH 119 

Qy 112 PARPSLGSELEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLS 171 

I I : I I III 
Db 120 RTLAGLGKLIATLR AARSTAQ 140 

Qy 172 LPNSTAQALLAARVDPP— EVYHLLFGPSSALDSQS GLHKGQEPWSRLGGNPLFRME 226 

I I I : I : I I : I II : I I :: I I : I I I I 
Db 141 -PQPTKQSPL EPPMLDVAELL TSLLRTESLGLALGQAQEPLHSL 183 

Qy 227 ELLLAPALLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAE 286 

I I I : : I 11:11 ! I : : I : I I If: 



Db 184 -LEAAGDLAQELLALRSLVELRALLQRPRGTSGPLELLSEALCS VRGPSST 233



Qy 287 LRNQLDVAKVSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALL 346

: I : : I : I : I : I : I I I I I I 
Db 234 VGPSLNWYEASDLMELVGQEPESALPDSSLSPACSELIGAL DSHPLSRL 282 

Qy 347 LPQGACTGRTPGPPASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAF 406

Db 283 282 

Qy 407 VQLWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAG 466 

I I I : I : : I I : I : I I 

Db 283 — LWRRLKPLILG KLLFAPDT 301 

Qy 467 SEVDRVILKANETFAFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLH 526 

: : : : I I I : : : I I : I : I : I : I I II 

Db 302 PFTRKLMAQVNRTFEELTLLRDVREVWEMLGPRIFTFMNDSSNVAMLQRLLQMQDEGRRQ 361 

Qy 527 P EALNLSLDELPPALRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSK-VSV 577 

I I I I I I II I | : : : : : | : 

Db 362 PRPGGRDHMEALRSFLDP GSGGYSWQDAHADVGHLVGTLGRVTECLSL 409 

Qy 578 DIFKGFPDEESIVNYTLNQAYQDNVTVFASVI FQTRKDGSLPP HVHYKIR 627

I : I I : : I : I : : I I : I : I I I I I I II 

Db 410 DKLEAAPSEAALVSRALQLLAEHR — FWAGWFLGPEDSSDPTEHPTPDLGPGHVRIKIR 467

Qy 628 QNS S FTEKTNEI RRAYWRPGPN TGGRFYFLYGFVWIQDMMERAI I DT FVGHDWEP 683 

: : I I : I I : I I I I I II I I I : : I I : : I I I : I : 

Db 468 MDIDVVTRTNKIRDRFWDPGPAADPLTDLR-YVWGGFVYLQDLVERAAVRVLSGAN-PRA 525 

Qy 684 GSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQHIVAEKEHRLKEVMKTMG 743 

I I : I I I I I I I II I : : I I : : : I : I I I : I : : : I I I I I I : : I : I I 
Db 526 GLYLQQMPYPCYVDDVFLRVLSRSLPLFLTLAWIYSVTLTVKAWREKETRLRDTMRAMG 585 

Qy 744 LNNAVHWVAW F I T G FVQ LSI S VTALT AI LK YGQVLMH S H WI I W L FLAVYAVAT IM FC FL 803 

I : I I I : I I : I I : I I I : I : I I : : : I I I I : I I I I : II 

Db 586 LSRAVLWLGWFLSCLGPFLLSAALLVLVLKLGDILPYSHPGWFLFLAAFAVATVTQSFL 645 

Qy 804 VSVLYSKAKLASACGGIIYFLSYVPY-MYVAIREEVAHDKITAFEKCIASLMSTTAFGLG 862 

:| : I : I I I : I I I I : I I |:|| : II I I : : I : |||:| III I 

Db 646 LSAFFSRANLAAACGGLAYFSLYLPYVLCVAWR DRLPAGGRVAASLLSPVAFGFG 700

Qy 863 SKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILTWYIEAVHPGMY 922

= Ml I I III I 11:1 :|::| | :| |: I | | :| | | | I | 

Db 701 CESIALLEEQGEGAQWHNVGTRPT-ADVFSLAQVSGLLLLDTV/VLYGLATWYLEAVCPGQY 759

Qy 923 GLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQACAMESRRFEETRGMEE 982 

I : I ! I I I : : I I I I I I . I : : I II 
Db 760 GIPEPWNFPFRRSYWCGP-RPPKSPAPCPTPLDPKVLV EE 798 

Qy 983 EPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLGHNGAGKTTTMSILTGLF 1042

I I II II: : I I I I I : I : : : I I I I I I I I I I II : I I I : I I I 
Db 799 APPGLSPGVSVRSLEKRFPGSPQPALRGLSLDFYQGHITAFLGHNGAGKTTTLSILSGLF 858 

Qy 1043 PPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRR 1102 

II: III I I I I : I : I II : I I : I I I : I I I I I I I I : I I : I I I I I I : : : 
Db 859 PPSGGSAFILGHDVRSSMAAIRPHLGVCPQYNVLFDMLTVDEHWFYGRLKGLSAAWGP 918 



1103 EMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAI 1162
I I :|: :. I I I I I : I I I I I I I I I I I I I : :|||||IMIII :|| I 

919 EQDRLLQDVGLVSKQSVQTRHLSGGMQRKLSVAIAFVGGSQWILDEPTAGVDPASRRGI 978

1163 WDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLV 1222 

I : I : I I I : I I I : : I I I I I : I I I : I I I I I : I : : : I : I I I I I I I I I : I II MM 

979 WELLLKYREGRTLILSTHHLDEAELLGDRVAWAGGRLCCCGSPLFLRRHLGSGYYLTLV 1038 

1223 KR — PAEPGGPQEPGLASSPPGRAPLSSCSE LQVSQFIRKHVASCLLVSDTSTE 1274 

II : : | | : | : | : : : | M : I 
1039 KARLPLTTNEK7\I)TDMEGSVDTRQEKKNGSQGSRVGTPQLIALVQHWVPGARLVEELPHE 1098 

1275 LSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEA 1334 

I Ml I I : I I I : I : I I I : : I : I I : I I I : M I I II I 
1099 LVLVLPYTGAHDGSFATLFRELDTRLAELRLTGYGISDTSLEEIFLKWEE CAA 1152 

1335 DVKESRKDVLPGAEGPASGEGHAG-NLARCSELTQSQASLQSASSVGSA-RGDEGAGYTD 1392 

I I : I : I III:: : : : : I : : III I : I : I 
1153 DT DMEDGSCGQHLCTGIAGLDVTLRLKMPPQETALENGEPAGSAPETDQGSG 1204 

1393 VYGDYRPLFDNPQDPDNVSLQEVE7VEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCAR 1452 

Ml I : I I : I : I I I : I I I I I ' 

1205 PDAVG— RVQGWALTR QQLQALLLKRFLLAR 1233 

1453 RNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEE 1512 

I : : I I : I I M I I II : I : MM I I I I I I : I I : : : I : 

1234 RSRRGLFAQIVLPALFVGLALVFSLIVPPFGHYPALRLSPTMY GAQVSFFSED 1286 

1513 RREYRLRLSP-DASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARF 1571 

Ml M: :| I : I I : I II 
1287 APGDPGRARLLEALLQEAG LEEP PVQHSSH RF 1318 

1572 FDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSL 1631 
: I Ml MM 

1319 SAP EVP AEVAKVLAS GNWT PES PS PA 1344 

1632 P RLVREP VRCT C S AQGT GF SCPSSVGGHPPQMRW-TGDILTDITGHNVSEYLLFTS 1687 

I II I Ih: II II I :|::: ::|| |:|::|: I 

1345 CQCSRPGARRLLPDCPAAAGGPPPPQAVTGSGEWQNLTGRNLSDFLVKTY 1395 

1688 DRFRLH RYGAITFG — : NVL 1704 

I I I I : I ' M 

1396 PRLVRQGLKTKKWVNEVRYGGFSLGGRDPGLPSGQELGRSVEELWALLSPLPGGALDRVL 1455 

1705 KSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPA 1764 

I : : I : :::::: I I I I : I I I : : I M I II II M I II 
1456 KNLTA WAHSLDAQDSLKIWFNNKGWHSMVAFVNRASNAILRAHLPP GPA 1504 

1765 — AYGITVTNHPMNKTSASLSLDYLL-QGTDWIAIFIIVAMSFVPAS FWFLVAEKSTK 1821 

MM II I : I I II I : I I : : : I : : I I I I I I I I I : I : I : I : 
1505 RHAH S I T T LNH P LN LT KEQL S EAALMAS S VDVLVS I C WFAM S FVPAS FTLVLI E E RVT R 1564 

1822 AKHLQ FVS GCN P 1 1 YWLAN YVWDMLN YLVPAT CC VI I L FVFDL P AYT S PTN FPAVL S L FL 1881 

I I I I I : I M : I I I I : : I I I I I I I I I Ml I II M l MM I I 
1565 AKHLQLMGGLSPTLYWLGNFLWDMCNYLVPACIWLIFLAFQQRAYVAPT^NLPALLLLLL 1624 



Qy 1882 LYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSY 1941 

I I I I II I I : I I I I I I : I llhlll I I I I I I II :: I I I : I : I I I : I : I : 

Db 1625 LYGWSITPLMYPASFFFSVPSTAYWLTCINLFIGINGSMATFVLELFS-DQKLQEVSRI 1683 

Qy 1942 LKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGV 2001

II I I I I I : : I I I I : : I I :::::: | : : | | I : : I : I : I I : : I 

Db 1684 LKQVFLIFPHFCLGRGLIDMVRNQAM7U)AFERLGD-RQFQSPLRWEWGKNLLAMVIQGP 1742 

Qy 2002 VGFLLTIMCQYNFLRRPQRMP VSTKPV— EDDVDVASERQRVLRGDADNDMVKIENL 2056 

: I I : : I : I I : I I : I : I : I I I I I I : I I : : I I :: : I I 

Db 1743 LFLLFTLLLQH RSQLLPQPRVRSLPLLGEEDEDVARERERWQGATQGDVLVLRNL 1798 

Qy 2057 TKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNG 2116

MM: : | : MINIM: I I I I M I M I I M I I I I I I : I : I I I : III : I 
Db 1799 TKVYRGQ RMP AVDRLCLGI P P GEC FGLLGVN GAGKT S T FRMVT GDT LAS RGEAVLAG 1855 

Qy 2117 HSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLEL 2176

IIIM I : M I I I II : I : I I II I I : I II I I : I : I : I I 

Db 1856 HSVAREPSAAHLSMGYCPQSDAIFELLTGREHLELLARLRGVPEAQVAQTAGSGLARLGL 1915 

Qy 2177 TKYADKPAGT YSGGNKRKLSTAIALI GYPAFI FLDEPTTGMDPKARRFLWNLI LDLI KTG 2236 

: I I I : II I I I I II I I I M : I I : II : I II : I I I I I I I I I I I II I I I II M : : : I 
Db 1916 SWYADRPAGTYSGGNKRKLATALALVGDPAWFLDEPTTGMDPSARRFLWNSLLAWREG 1975 

Qy 2237 RSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKDV 2296

I M : I II I I I II I II II : I I I I I I II I I II I I II I I II I : MM :: : | : 
Db 1976 RSVMLTSHSMEECEALCSRLAIMVNGRFRCLGSPQHLKGRFAAGHTLTLRVPAARS-QPA 2034 

Qy 2297 VRFFNRNFPEAMLKERHHTKVQYQL-KSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLD 2355 

I 11:1:11 : : : : I I ' : I I : I I : : I : I I : M I M I : 

Db 2035 AAFVAAEFPGSELREAHGGRLRFQLPPGGRCALARVFGELAVHGAEHGVEDFSVSQTMLE 2094

Qy 2356 NVFVNFAK— KQSDNLEQQE TEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

I I : I : I IIIMI : I III I I : I I I 

Db 2095 EVFLYFSKDQGKDEDTEEQKEAGVGVDPAPGLQHPKRVSQFLDDPSTAETVL 2146 



RESULT 11 
Q8IZY2 



ID Q8IZY2 PRELIMINARY; PRT; 2146 AA. 

AC Q8IZY2; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE ABC transporter ABCA7.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20549028; PubMed=11095984;

RA Kaminski W.E., Piehler A., Schmitz G.;

RT "Genomic organization of the human cholesterol-responsive ABC

RT transporter ABCA7: tandem linkage with the minor histocompatibility

RT antigen HA-1 gene.";

RL Biochem. Biophys. Res. Commun. 278:782-789(2000).



DR EMBL; AF311102; AAN04657.1; -. 

DR EMBL; AF311057; AAN04657.1; JOINED.

DR EMBL; AF311058; AAN04657.1; JOINED. 

DR EMBL; AF311059; AAN04657.1; JOINED. 

DR EMBL; AF311060; AAN04657.1; JOINED. 

DR EMBL; AF311061; AAN04657.1; JOINED. 

DR EMBL; AF311062; AAN04657.1; JOINED. 

DR EMBL; AF311063; AAN04657.1; JOINED. 

DR EMBL; AF311064; AAN04657.1; JOINED. 

DR EMBL; AF311065; AAN04657.1; JOINED. 

DR EMBL; AF311066; AAN04657.1; JOINED. 

DR EMBL; AF311067; AAN04657.1; JOINED.

DR EMBL; AF311068; AAN04657.1; JOINED. 

DR EMBL; AF311069; AAN04657.1; JOINED. 

DR EMBL; AF311070; AAN04657.1; JOINED. 

DR EMBL; AF311071; AAN04657.1; JOINED. 

DR EMBL; AF311072; AAN04657.1; JOINED. 

DR EMBL; AF311073; AAN04657.1; JOINED. 

DR EMBL; AF311074; AAN04657.1; JOINED. 

DR EMBL; AF311075; AAN04657.1; JOINED. 

DR EMBL; AF311076; AAN04657.1; JOINED. 

DR EMBL; AF311077; AAN04657.1; JOINED. 

DR EMBL; AF311078; AAN04657.1; JOINED. 

DR EMBL; AF311079; AAN04657.1; JOINED. 

DR EMBL; AF311080; AAN04657.1; JOINED. 

DR EMBL; AF311081; AAN04657.1; JOINED. 

DR EMBL; AF311082; AAN04657.1; JOINED. 

DR EMBL; AF311083; AAN04657.1; JOINED. 

DR EMBL; AF311084; AAN04657.1; JOINED. 

DR EMBL; AF311085; AAN04657.1; JOINED. 

DR EMBL; AF311086; AAN04657.1; JOINED. 

DR EMBL; AF311087; AAN04657.1; JOINED. 

DR EMBL; AF311088; AAN04657.1; JOINED. 

DR EMBL; AF311089; AAN04657.1; JOINED. 

DR EMBL; AF311090; AAN04657.1; JOINED. 

DR EMBL; AF311091; AAN04657.1; JOINED. 

DR EMBL; AF311092; AAN04657.1; JOINED. 

DR EMBL; AF311093; AAN04657.1; JOINED. 

DR EMBL; AF311094; AAN04657.1; JOINED. 

DR EMBL; AF311095; AAN04657.1; JOINED. 

DR EMBL; AF311096; AAN04657.1; JOINED. 

DR EMBL; AF311097; AAN04657.1; JOINED. 

DR EMBL; AF311098; AAN04657.1; JOINED. 

DR EMBL; AF311099; AAN04657.1; JOINED. 

DR EMBL; AF311100; AAN04657.1; JOINED. 

DR EMBL; AF311101; AAN04657.1; JOINED. 

DR Genew; HGNC:37; ABCA7.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005524; F:ATP binding; IEA.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA.

DR GO; GO:0000166; F:nucleotide binding; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2. 

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2. 



DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.
DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

SQ SEQUENCE 2146 AA; 234422 MW; 33A128082D7B5BAF CRC64;

Query Match 28.6%; Score 3618; DB 4; Length 2146; 

Best Local Similarity 35.7%; Pred. No. 1.2e-230; 

Matches 898; Conservative 362; Mismatches 774; Indels 478; Gaps 60; 

Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60 

II I I I I I I I I : I I I I I : I I I I I I I : : I I : I I 
Db 1 MAFWTQLMLLLWKNFMYRRRQPVQLLVELLWPLFLFFILVAVRHSHPPLEHHECHF-PNK 59 

Qy 61 PLTSAGILPVMQSLCPDGQR — DEFGFLQYANSTVTQLLERLDRWEEGNLFD 111 

I I I I I : I : I I : : I I I I : : I III 

Db 60 PLPSAGTVPWLQGLICNVNNTCFPQLTPGEEPGRLSNFNDSLVSRLLADARTVLGGASAH 119 

Qy 112 PARPSLGSELEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLS 171 

I I : I I III 
Db 120 RTLAGLGKLIATLR AARSTAQ 140

Qy 172 LPNSTAQALLAARVDPP— EVYHLLFGPSSALDSQS GLHKGQEPWSRLGGNPLFRME 226 

I I I : I : I I : I II : I I : : I I : I I I I 
Db 141 -PQPTKQSPL EPPMLDVAELL TSLLRTESLGLALGQAQEPLHSL 183 

Qy 227 ELLLAPALLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAE 286

I I I : : I II : I I I I : : I : I I I I : 

Db 184 -LEAAEDLAQELLALRSLVELRALLQRPRGTSGPLELLSEALCS VRGPSST 233

Qy 287 LRNQLDVAKVSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALL 346

: I : : I : I : I : hi I I III 
Db 234 VGPSLNWYEASDLMELVGQEPESALPDSSLSPACSELIGAL DSHPLSRL 282 

Qy 347 LPQGACTGRTPGPPASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAF 406

Db 283 282 

Qy 407 VQLWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAG 466

I I I : I : : I I : I : I I 

Db 283 — LWRRLKPLILG KLLFAPDT 301

Qy 467 SEVDRVILKANETFAFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLH 526 

: : : : I I I : : : I I : I : I : I : I I II 

Db 302 PFTRKLMAQVNRTFEELTLLRDVREVWEMLGPRIFTFMNDSSNVAMLQRLLQMQDEGRRQ 361 

Qy 527 P EALNLSLDELPPALRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSK-VSV 577 

I I I I II II I I : : : : : | : 

Db 362 PRPGGRDHMEALRSFLDP GSGGYSWQDAHADVGHLVGTLGRVTECLSL 409 

Qy 578 DIFKGFPDEESIVNYTLNQAYQDNVTVFASVI FQTRKDGSLPP HVHYKIR 627 

I : I I : : I : I : : I I : I : I I I I I I I I 

Db 410 DKLEAAPSEAALVSRALQLLAEHR— FWAGWFLGPEDSSDPTEHPTPDLGPGHVRIKIR 467 

Qy 628 QNSSFTEKTNEIRRAYWRPGPN TGGRFYFLYGFVWIQDMMERAI I DTFVGHDWEP 683 

: : I I : I I : I I I I. I II I II : : I I : : I I I : I : 

Db 468 MDIDV^RTNKIRDRFWDPGPAADPLTDLR-YWGGFVTLQDLVERT^AVRVLSGAN-PRA 525



Qy 684 GSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQHIVAEKEHRLKEVMKTMG 743

I I : I I I I I I INI: : I I : : : I : I I I : I : : : I I I I I I : : I : II

Db 526 GLYLQQMPYPCYVDDVFLRVLSRSLPLFLTLAWIYSVTLTVKAWREKETRLRDTMRAMG 585

744 LNNAVHWVAW FI TGFVQL S I SVT ALTAI LKYGQVLMHS H WI I WL FLAVYAVAT IMFC FL 803 

I : I I I : I I : : : I I : I I I : I : I I : : : I I I I : I I I I : II 

586 LSRAVLWLGWFLSCLGPFLLSAALLVLVLKLGDILPYSHPGWFLFLAAFAVATVTQSFL 645 

804 VSVLYSKAKLASACGGIIYFLSYVPY-MYVAIREEVAHDKITAFEKCIASLMSTTAFGLG 862

:| : I : I I I : I I I I : II I : I I : II I | : : I : |||:| Ml I 

646 LSAFFSRANLAAACGGLAYFSLYLPYVLCVAWR D RL P AGG RVAAS L L S P VAFG FG 700 

863 SKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTMLMVDAVVYGILTWYIEAVHPGMY 922

: M I Mill I I hi :|::M :||: llhlll Ml 

701 CESLALLEEQGEGAQWHNVGTRPT-ADVFSLAQVSGLLLLDAALYGLATWYLEAVCPGQY 759 

923 GLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQACAMESRRFEETRGMEE 982 

I : I I I I I : : I I II I I I : : I | | 

760 GIPEPWNFPFRRSYWCGP-RPPKSPAPCPTPLDPKVLV EE 798 

983 EPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLGHNGAGKTTTMSILTGLF 1042

I I M M : Ml I I I M : : : I I I II I I II I I I : I M : I I I 

799 APPGLSPGVSVRSLEKRFPGSPQPALRGLSLDFYQGHITAFLGHNGAGKTTTLS.ILSGLF 858 

1043 PPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRR 1102 
I I M I I I I I I : I M I I : I I : I I I M I I II I I I : I I : I I I M I : : : 
859 PPSGGSAFILGHDVRSSMAAIRPHLGVCPQYNVLFDMLTVDEHVWFYGRLKGLSAAWGP 918

1103 EMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAI 1162
I I : : : : I : I : I : : I I I I I : M I II II I I I I I I : M I I I M I I II I Ml I
919 EQDRLLQDVGLVSKQSVQTRHLSGGMQRKLSVAIAFVGGSQWILDEPTAGVDPASRRGI 978

1163 WDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLV 1222
IMMII: I I I : M M I I M I I M I I I I M : : : IM IMIIMI: I II MM 
979 WELLLKYREGRTLILSTHHLDEAELLGDRVAWAGGRLCCCGSPLFLRRHLGSGYYLTLV 1038 

1223 KR — PAEPGGPQEPGLASSPPGRAPLSSCSE LQ VS Q F I RKHVAS C L L VS D T S T E 1274 

I I : : I I : I : I : : : I I I : I 

1039 KARLPLTTNEKADTDMEGSVDTRQEKKNGSQGSRVGTPQLLALVQHWVPGARLVEELPHE 1098 

1275 LSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEA 1334

I Ml I I M MM: I I I : : I M I M I I : M I I II I 
1099 LVLVLPYTGAHDGSFATLFRELDTRLAELRLTGYGISDTSLEEIFLKWEE — CAA 1152 



1335 DVKESRKDVLPGAEGPASGEGHAG-NLARCSELTQSQASLQSASSVGSA-RGDEGAGYTD 1392

I I : I M III:: : : : : I : : III I : I : I

1153 DT DMEDGSCGQHLCTGIAGLDVTLRLKMPPQETALENGEPAGSAPETDQGSG 1204

1393 VYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCAR 1452

III I : I I M : I II : I I I II
1205 PDAVG— RVQGWALTR QQ LQALLLKRFLLAR 1233

1453 RNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEE 1512
I : M I : I I : II I I I : I : MM I I II I I M I : : : I : 

1234 RSRRGLFAQIVLPALFVGLALVFSLIVPPFGHYPALRLSPTMY GAQVSFFSED 1286

1513 RREYRLRLSP-DASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARF 1571 
Ml M : M I : I I : I | | 



Db 1287 APGDPGRARLLEALLQEAG LEEP PVQHSSH RF 1318 

Qy 1572 FDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSL 1631 

: I 1:1 I I I I 

Db 1319 SAP E VP AE VAKVLAS GNWT PESPSPA 1344 

Qy 1632 PRLVREPVRCTCSAQGTGF SCPSSVGGHPPQMRW-TGDILTDITGHNVSEYLLFTS 1687

III I I I I I I I I : I : : : : : | | | : | : : | : | 

Db 1345 CQCSRPGARRLLPDCPAAAGGPPPPQAVTGSGEWQNLTGRNLSDFLVKTY 1395 

Qy 1688 DRFRLH RYGAITFG --NVL 1704 

I I I I : I I I 

Db 1396 PRLVRQGLKTKKWVNEVRYGGFSLGGRDPGLPSGQELGRSVEELWALLSPLPGGALDRVL 1455 



Qy 1705 KSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPA 1764

I : : I : :::::: I I I I : I I I : : I : I I I I I I : I I I I 

Db 1456 KNLTA WAHSLDAQDSLKIWFNNKGWHSMVAFVNRASNAILRAHLPP GPA 1504 

Qy 1765 — AYGI TVTNHPMNKT SAS L- S LDYLLQGTDWI AI FI I YAMS FVPAS FWFLVAEKSTK 1821 

I : I I llhll I : I I : : : I : : I I I I I I I I I : I : I : I : 

Db 1505 RHAHS ITTLNHPLNLTKEQLFEAALMAS SVDVLVS I CWFAMS FVPAS FTLVLI EERVTR 1564 

Qy 1822 AKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFL 1881 

Mill : I :| MM I.: : I I I llllll Ml I II :| I ||:| I I 
Db 1565 AKHLQLMGGLSPTLYWLGNFLWDMCNYLVPACIWLIFLAFQQRAYVAPANLPALLLLLL 1624 

Qy 1882 LYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSY 1941 

I I I I I I I I : I I I I I I : I llhlll I I II I I I I : : I I I : I .: I I I : I : | : 

Db 1625 LYGWSITPLMYPASFFFSVPSTAYWLTCINLFIGINGSMATFVLELFS-DQKLQEVSRI 1683 

Qy 1942 LKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGV 2001 

II I I I I I : : I I I I :: I I :::::: | : : | | | : : | : | : | | : : | 

Db 1684 LKQVFLIFPHFCLGRGLIDMVRNQAMADAFERLGD-RQFQSPLRWEWGKNLLAMVIQGP 1742 






2002 VGFLLTIMCQYNFLRRPQRMP VSTKPV — EDDVDVASERQRVLRGDADNDMVKI ENL 2056 

: I I : : I : I I : I I : I : I :| I I I I I : I I : : I I : : : I I 

1743 LFLLFTLLLQH RSQLLPQPRVRSLPLLGEEDEDVARERERWQGATQGDVLVLRNL 1798 

2057 TKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNG 2116 

MM: : I : MINIM: II I M I I I I I M II I I I I I : I : I I I : II I : I 
1799 TKVYRGQ RMPAVDRLCLGIPPGECFGLLGVNGAGKTSTFRMVTGDTLASRGEAVLAG 1855 

2117 HSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLEL 2176 

I II : I I : I I I M I I : I : I I I II I : I MM: I : I : I I 

1856 HSVAREPSAAHLSMGYCPQSDAIFELLTGREHLELLARLRGVPEAQVAQTAGSGL7VRLGL 1915 

2177 TKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTG 2236 

: ll|:||||IMIIIIII:ll:||:| II : I I I It I I I I I I lllllll :| ::: I 
1916 SWYADRPAGTYSGGNKRKLATALALVGDPAWFLDEPTTGMDPSARRFLWNSLLAVVREG 1975 

2237 RSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKDV 2296

I I I : I I I I II I II I II I : M I M I M I Mill MM II I: MM :::| : 
1976 RSVMLTSHSMEECEALCSRLAIMVNGRFRCLGSPQHLKGRFAAGHTLTLRVPA7VRS-QPA 2034 

2297 VRFFNRNFPEAMLKERHHTKVQYQL-KSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLD 2355 

I M I I : I I : : : : II : I I : I I : : I : I I : II II I I : 

2035 AAFVAAEFPGAELREAHGGRLRFQLPPGGRCALARVFGELAVHGAEHGVEDFSVSQTMLE 2094



Qy 2356 NVFVNFAK KQSDNLEQQE TEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

I I : I : I I I I I : I : I III I I : I I I 

Db 2095 EVFLYFSKDQGKDEDTEEQKEAGVGVDPAPGLQHPKRVSQFLDDPSTAETVL 2146



RESULT 12 
Q9NR73 

ID Q9NR73 PRELIMINARY; PRT; 2146 AA. 

AC Q9NR73; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Macrophage ABC transporter. 

GN ABCA7.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1]

RP SEQUENCE FROM N.A.

RX MEDLINE=20334305; PubMed=10873640;

RA Kaminski W.E., Orso E., Diederich W., Klucken J., Drobnik W.,

RA Schmitz G.;

RT "Identification of a Novel Human Sterol-Sensitive ATP-Binding Cassette 

RT Transporter (ABCA7).";

RL Biochem. Biophys. Res. Commun. 273:532-538(2000). 

DR EMBL; AF250238; AAF85794.1; -. 

DR GO; GO:0016021; C:integral to membrane; TAS.

DR GO; GO:0005524; F:ATP binding; TAS.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; TAS.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

KW ATP-binding.

SQ SEQUENCE 2146 AA; 234469 MW; 679B16EB2D75FF0D CRC64;

Query Match 28.5%; Score 3616; DB 4; Length 2146; 

Best Local Similarity 35.7%; Pred. No. 1.6e-230; 

Matches 895; Conservative 364; Mismatches 777; Indels 474; Gaps 59; 
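
The "Best Local Similarity" figure in each summary appears to be the identical-residue count divided by the total number of alignment columns (matches + conservative + mismatches + indels). That definition is an assumption inferred from the numbers printed in this report, not a documented formula of the search program; the function name below is likewise hypothetical. A minimal Python sketch checking it against the counts reported for RESULT 12 through RESULT 15:

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# (matches, conservative, mismatches, indels) and the percentage printed
# in the report, copied from the summary lines of each result.
reported = {
    "Q9NR73": ((895, 364, 777, 474), 35.7),
    "Q96S58": ((840, 351, 699, 360), 37.3),
    "Q86UK0": ((740, 432, 814, 469), 30.1),
    "Q8IZW6": ((741, 432, 813, 469), 30.2),
}

for accession, (counts, printed) in reported.items():
    value = best_local_similarity(*counts)
    print(f"{accession}: computed {value:.1f}%, report prints {printed}%")
```

Under this assumption the computed values agree with the printed percentages to one decimal place for all four hits.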

Qy 1 MGFLHQLQLLLWKNVTLKRRS PWVLAFEI FI PLVLFFI LLGLRQKKPT I S VKEVP FYTAA 60 

II I I I I I I I I : I I I I I : I I I I I I I : : I I : I I 
Db 1 MAFWTQLMLLLWKNFMYRRRQPVQLLVELLWPLFLFFILVAVRHSHPPLEHHECHF-PNK 59 

Qy 61 PLTSAGILPVMQSLCPDGQR DEFGFLQYANSTVTQLLERLDRWEEGNLFD 111 

I I I I I : I : I I : : I I I . I : : I III 

Db 60 PLPSAGTVPWLQGLICNVNNTCFPQLTPGEEPGRLSNFNDSLVSRLLADARTVLGGASAH 119 



Qy 112 PARPSLGSELEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLS 171 

I I : I I III 
Db 120 RTLAGLGKLIATLR — . AARSTAQ 140 



Qy 172 LPNSTAQALLAARVDPP — EVYHLLFGPSSALDSQS GLHKGQEPWSRLGGNPLFRME 226 

I I I : I : I I : I II : I I : : I I : I I I I 
Db 141 -PQPTKQSPL EPPMLDVAELL TSLLRTESLGLALGQAQEPLHSL 183 

Qy 227 ELLLAPALLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAE 286

I I I : : I 11:11 I I : : I : I I II: 
Db 184 -LEAAEDLAQELLALRSLVELRALLQRPRGTSGPLELLSEALCS VRGPSST 233 

Qy 287 LRNQLDVAKVSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALL 346 

: | : : | : | : I : hi I I III 
Db 234 VGPSLNWYEASDLMELVGQEPESALPDSSLSPACSELIGAL DSHPLSRL 282 

Qy 347 L PQGACT GRT PGP PAS GAGGAANGTGAGAVMGPNATAEEGAP S AAALAT PDTLQGQCSAF 406 

Db 283 282 

Qy 407 VQLWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAG 466 

I I I : I : : I I : I : I I 

Db 283 — LWRRLKPLILG KLLFAPDT 301 

Qy 467 SEVDRVILKANETFAFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLH 526

: : : : I I I : : : I I : I : I : I : I I II 

Db 302 P FTRKLMAQVNRT FEELT LLRDVREVWEMLGPRI FT FMNDS SNVAMLQRLLQMQDEGRRQ 361 

Qy 527 P — : EALNLSLDELPPALRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSK-VSV 577 

I III II II I | : : : : : | : 

Db 362 PRPGGRDHMEALRSFLDP GSGGYSWQDAHADVGHLVGTLGRVTECLSL 409 

Qy 578 DIFKGFPDEESIVNYTLNQAYQDNVTVFASVI FQTRKDGSLPP HVHYKIR 627 

I : I I :: I : I : : I I : I : I I I I I I I I 

Db 410 DKLEAAPSEAALVSRALQLLAEHR — FWAGWFLGPEDSSDPTEHPTPDLGPGHVRIKIR 467 

Qy 628 QNSSFTEKTNEIRRAYWRPGPN TGGRFYFLYGFVWIQDMMERAIIDTFVGHDWEP 683 

: : I I : I I : I I I I I II I I I : : I I : : I I I : I : 

Db 468 MDIDWTRTNKIRDRFWDPGPAADPLTDLR-YVWGGFVYLQDLVERAAVRVLSGAN-PRA 525 

Qy 684 GSYVQMFPYPCYTRDDFLFVIEHMMPLCMVI SWVYSVAMTIQHIVAEKEHRLKEVMKTMG 743 

11:1 I I I I I MM: : | | : : : | : | | | : | : : : M I I I I : : I : I I 
Db 526 GLYLQQMPYPCYVDDVFLRVLSRSLPLFLTLAWIYSVTLTVKAWREKETRLRDTMRAMG 585 

Qy 744 LNNAVHWVAWFITGFVQLSI SVTALTAILKYGQVLMHSHVVI IWLFLAVYAVATIMFCFL 803 

I : I I I : I I : : : I I : I I I : I : I I : : : I I I I : I I I I : II 

Db 586 LSRAVLWLGWFLSCLGPFLLSAALLVLVLKLGDILPYSHPGWFLFLAAFAVATVTQSFL 645 

Qy 804 VSVLYSKAKLASACGGIIYFLSYVPY-MYVAIREEVAHDKITAFEKCIASLMSTTAFGLG 862

: I : I : I II : I I I I : I I I : I I : I I I I :: I : I I I : I I I I I 

Db 646 LSAFFSRANLAAACGGLAYFSLYLPYVLCVAWR-----DRLPAGGRVAASLLSPVAFGFG 700

Qy 863 S KYFALYEVAGVGI QWHT FSQS PVEGDDFNLLLAVTMLMVDAVVYGI LTWYI EAVHPGMY 922 

: III I II I I I 11:1 : I :: I I : I I : I I I : I I I I II 

Db 701 CESLALLEEQGEGAQWHNVGTRPT-ADVFSLAQVSGLLLLDAALYGLATWYLEAVCPGQY 759

Qy 923 GLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVMEEDQACAMESRRFEETRGMEE 982

I : II II I :: I I I I I I I :: I I I 

Db 760 GIPEPWNFPFRRSYWCGP-RPPKSPAPCPTPLDPKVLV— EE 798 



Qy 



983 EPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQVVS FLGHNGAGKTTTMSILTGLF 1042 



I I I I II: : I I I I I : I : : : I I I I I I I I I I I I : I II : I I I 

799 APPGLSPGVSVRSLEKRFPGSPQPALRGLSLDFYQGHITAFLGHNGAGKTTTLSILSGLF 858 

1043 PPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRR 1102 

I I : I I I I I I I : I : I I I : I I : I I I : I I I I I I I I : I I : I I I I I I : : • 

859 PPSGGSAFILGHDVRSSMAAIRPHLGVCPQYNVLFDMLTVDEHVWFYGRLKGLSAAWGP 918 

1103 EMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAI 1162 

I I : : : : I : I : I : : I I I I I : I I I II I I I I I II I : : I I I I I I I I II I : I I I 

919 EQDRLLQDVGLVSKQSVQTRHLSGGMQRKLSVAIAFVGGSQWILDEPTAGVDPASRRGI 978 

1163 WDLI LKYKPGRTI LLSTHHMDEADLLGDRI AI I SHGKLKCCGS PLFLKGT YGDGYRLTLV 1222 

I : I : I I I : I I I : : I I I I I : I I I : I I I I I : I : : : I : I I I I I I I I I : I II I I I I 

979 WELLLKYREGRTLILSTHHLDEAELLGDRVAWAGGRLCCCGSPLFLRRHLGSGYYLTLV 1038 

1223 KR — PAEPGGPQEPGLASSPPGRAPLSSCSE LQVS Q F I RKHVAS CLLVS DT S T E 1274 

II : : I I : I : I : : : I I I : I 
1039 KARLPLTTNEKADTDMEGSVDTRQEKKNGSQGSRVGTPQLLALVQHWVPGARLVEELPHE 1098

1275 LSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEA 1334

I : I I I I : I I I : I : I I I : : I : I I : I I I : I I I I I I I 
1099 LVLVLPYTGAHDGSFATLFRELDTRLAELRLTGYGISDTSLEEIFLKWEE CAA 1152 

1335 DVKESRKDVLPGAEGPASGEGHAG-NLARCSELTQSQASLQSASSVGSA-RGDEGAGYTD 1392 

I I : I : I III:: : : : : | : : III I : I : I 
1153 DT DMEDGSCGQHLCTGIAGLDVTLRLKMPPQETALENGEPAGSAPETDQGSG 1204 

1393 VYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCAR 1452 

II I I : I I : I : I I I : I I I I I 

1205 P DAVG — RVQGWALT R QQLQALLLKRFLLAR 1233 

1453 RNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEE 1512 

I : : I I : I I : I I I I I : I : : I II I I I III: I I : : : I : 

1234 RSRRGLFAQIVLPALFVGLALVFSLIVPPFGHYPALRLSPTMY-------GAQVSFFSED 1286

1513 RREYRLRLSP-DASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARF 1571 

: T I : I : : I I : I I : I II 

1287 APGDPGRARLLEALLQEAG LEEP PVQHSSH RF 1318 

1572 FDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSL 1631 
: I I : I I I I I 

1319. SAP E VP AEVAKVLAS GNWT PESP S PA 1344 

1632 PRLVREPVRCTCSAQGTGF S C P S S VGGH P PQMRW- T GD I LT DI TGHNVS E YLL FT S 1687 

III I I I : : I I I I I : | : : : : : I | | : | : : I : I 

1345 CQCSQPGARRLLPDCPAAAGGPPPPQAVTGSGEWQNLTGRNLSDFLVKTY 1395 

1688 DRFRLH RYGAITFG NVL 1704 

I I I I : I II 

1396 PRLVRQGLKTKKWVNEVRYGGFSLGGRDPGLPSGQELGRSVEELWALLSPLPGGALDRVL 1455 

1705 K S I PAS FGT RAP PMVRK I AVRRAAQVF YNN KGYHSMPTYLNS LNNAI L RAN L P K S KGN PA 1764 

I : : I : :::::: I I I I : I I I : : I : I I I I I I : I I : 

1456 KNLTA WAHSLDAQDSLKIWFNNKGWHSMVAFVNRASNAILRAHLPPGRAR-H 1506 



1765 AYGITVTNHPMNKTSASL-SLDYLLQGTDWIAIFIIVAMSFVPASFVVFLVAEKSTKAK 1823 
I : II 111:11 I : I I : : : I : : I I I I I I I I I : I : I : I : I I 



Db 



1507 AHS ITTLNHPLNLTKEQLFEAALMAS S VDVLVS I C WFAMS FVPAS FTLVLI EERVTRAK 1566 



Qy 1824 HLQ FVS GCN P 1 1 YWLAN YVWDMLN YLVPAT CCVI I L FVFDL PAYT S PTN FPAVL S L FLL Y 1883 

I ! I : I : I : I I I I : : I I I I I I I I I hi I I I : I I I I : I I' I I I 
Db 1567 HLQLMGGLSPTLYWLGNFLWDMCNYLVPACIVVLIFLAFQQRAYVAPANLPALLLLLLLY 1626 

Qy 1884 GWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLK 1943 

I I I I I I : I I I I I I : I llhlll I I I I I I I I : : I I I : I : I I I : I : I : II 

Db 1627 GWSITPLMYPASFFFSVPSTAYWLTCINLFIGINGSMATFVLELFS-DQKLQEVSRILK 1685 

Qy 1944 SCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWG 2003 

I I I I I : : II I I : : I I :::::: I : : I I I : : I : I : I I : : I : 
Db 1686 QVFL I FP H FC LGRGL I DMVRNQAMADAFERLGD- RQ FQ S P LRWE WGKN L LAMVI QG P L F 1744 

Qy 2004 FLLTIMCQYNFLRRPQRMP VS T K P V — ED D VD VAS E RQ RVL RG DADN DMVK I EN LT K 2058 

I I:: I: I I : I I : I : I : I I I I I I : I I : : I I : : : I I I I 

Db 1745 LLFTLLLQH RSQLLPQPRVRSLPLLGEEDEDVARERERWQGATQGDVLVLRNLTK 1800 

Qy 2059 VYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHS 2118 

II:: I : I I I I I I I I : I I I I I I I I I I I I I I I I I I I : I : I I I : I I I : I I I 
Db 1801 VYRGQ RMPAVDRLCLGIPPGECFGLLGVNGAGKTSTFRMVTGDTLASRGEAVLAGHS 1857 

Qy 2119 VLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTK 2178 

I : I I : I I I I I I I : I : I I I I I I : I MM: I : I : I I : 

Db 1858 VAREPSAAHLSMGYCPQSDAI FELLTGREHLELLARLRGVPEAQVAQTAGSGLARLGLSW 1917 

Qy 2179 YADKPAGTYSGGNKRKLSTAIALI GYPAFI FLDEPTTGMDPKARRFLWNLI LDLIKTGRS 2238 

I I I : I I I I I I II I II I I : II : II : I I I : I M II I I I I II I I I I I II : I ■ : : : Ml 
Db 1918 YADRPAGT YS GGNKRKLATALALVGDPAWFLDEPTTGMD P SARRFLWNS LLAWREGRS 1977 

Qy 2239 WLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKDWR 2298 

I : I I I I I I II I II I I : I I I I I I I I I I II I I I II I I I I : : I : I : : : I : 

Db 1978 VMLTSHSMEECEALCSRLAIMVNGRFRCLGSPQHLKGRFAAGHTLTLRVPAARS-QPAAA 2036

Qy 2299 FFNRNFPEAMLKERHHTKVQYQL-KSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNV 2357 

I I I : I : I I : : : : I I : I I : I I : : I : I I : I I I I I I : I 

Db 2037 FVAAEFPGSELREAHGGRLRFQLPPGGRCALARVFGELAVHGAEHGVEDFSVSQTMLEEV 2096 

Qy 2358 FVNFAK KQSDNLEQQE TEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

I : I : I IIIMI : I III I I : I I I 

Db 2097 FLYFSKDQGKDEDTEEQKEAGVGVDPAPGLQHPKRVSQFLDDPSTAETVL 2146 



RESULT 13 
Q96S58 

ID Q96S58 PRELIMINARY; PRT; 2008 AA. 

AC Q96S58; 

DT 01-DEC-2001 (TrEMBLrel. 19, Created) 

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE ABCA-SSN. 

GN ABCA7 /ABCA-SSN. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 



RP SEQUENCE FROM N.A. 

RX MEDLINE=21255283; PubMed=11355874;

RA Tanaka A., Ikeda Y., Abe-Dohmae S., Arakawa R., Sadanami K.,

RA Kidera A., Nakagawa S., Nagase T., Aoki R., Kioka N., Amachi T.,

RA Yokoyama S., Ueda K.;

RT "Human ABCA1 Contains a Large Amino-Terminal Extracellular Domain

RT Homologous to an Epitope of Sjogren's Syndrome.";

RL Biochem. Biophys. Res. Commun. 283:1019-1025(2001).

DR EMBL; AB055390; BAB62294.1; -.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005524; F:ATP binding; IEA.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA.

DR GO; GO:0000166; F:nucleotide binding; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

KW ATP-binding. 

SQ SEQUENCE 2008 AA; 218617 MW; 226FF85C24230B90 CRC64; 

Query Match 27.7%; Score 3515; DB 4; Length 2008; 

Best Local Similarity 37.3%; Pred. No. 7.5e-224; 

Matches 840; Conservative 351; Mismatches 699; Indels 360; Gaps 56; 

Qy 268 VC — S GQAAARARRFS GLS AELRNQLDVAKVSQQLGLDAPNGS DS S PQAP P PRRLQALLG 325 

I I : I I : I I : : I : I I I I I I t I I 

Db 2 VCLGTGQSA— GPLVSVQNHCPPCGLSPQESLGLALGQAQEP LHSLLE 47 

Qy 326 DL LD- AQ KVLQ DVDVL S ALAL L - L PQ GACT GRT P G P PAS GAGGAAN GT GAGAVMG PN AT A 383 

I I I : : I : : III I : I III : : I : : I I : 

Db 4 8 AAEDLAQELLALRSLVELRALLQRPRG TSGPLELLSEALCSVRGPSSTVGPSLNW 102 

Qy 384 EEGAPSAAAL ATPD-TLQGQCSAFV QLWAGLQPI LCGNNRT I EP 426 

I : : I I I : I I I : I I I : I : : I 
Db 103 YEASDLMELVGQEPESALPDSSLSPACSELIGALDSHPLSRLLWRRLKPLILG 155 

Qy 427 EAL RRGNMS S LG FT S KEQ RN L GL L VH LMT S N P K I L YAP AG S E VD RVI L KAN ET FAFVGNV 486 

I : I : I I : : : : | | | : : 

Db 156 KLLFAPDTPFTRKLMAQVNRTFEELTLL 183 

Qy 487 THYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHP EALNLSLDELP 538 

: I I : I : I : I : I I III III II 

Db 184 RDVREVWEMLGPRIFTFMNDSSNVAMLQRLLQMQDEGRRQPRPGGRDHMEALRSFLDP — 241 

Qy 539 PALRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSK-VSVDIFKGFPDEESIVNYTLNQA 597 

II I I : : : : : I : I : I I : : I : I 

Db 242 GS GGYSWQDAHADVGHLVGTLGRVTECLS LDKLEAAP S EAALVS RALQLL 291 

Qy 598 YQDNVT VFAS VI FQT RKDG S L P P HVHYKIRQNSSFTEKTNEIRRAYWRPG 647 

: : I I : I : I I I Mill: : I I : I I : I I I 

Db 292 AEHR — FWAGWFLGPEDSSDPTEHPTPDLGPGHVRIKIRMDIDWTRTNKIRDRFWDPG 349 

Qy 648 PN TGGRFYFLYGFVWIQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFV 703

I I II I II : : I I : : I I I : I : I I : I Mill I II I

Db 350 PAADPLTDLR-YVWGGFVYLQDLVERAAVRVLSGAN-PRAGLYLQQMPYPCYVDDVFLRV 407



Qy 704 IEHMMPLCMVISWVYSVAMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSI 763

' : I I : : : I : I I I : I :: : I I I I I I :: I : I I I : I I I : I I : : : 
Db 408 LSRSLPLFLTl^WIYSWLTVKAWREKETRLRDTMRAMGLSRAVLWLGWFLSCLGPFLL 467 

Qy 764 S VTAIjT AI LK YGQVLMH S HWI I WL FLAVYAVAT I MFC FLVS VL Y S KAKLAS ACGG 1 1 YF 823 

I I : I I I : I : I I : : : I I I I : I I | | : | | : | : I : I I I : I I I I : I I 
Db 468 SAALLVLVLKLGDILPYSHPGWFLFLAAFAVATVTQSFLLSAFFSRANLAAACGGLAYF 527 

Qy 824 LSYVPY-MYVAIREEVAHDKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFS 882

I : I I : I I I : : I : I I I : I Mil: Ml I I I I I 

Db 528 SLYLPYVLCVAWR DRLPAGGRVAASLLSPVAFGFGCESLALLEEQGEGAQWHNVG 582 

Qy 883 QSPVEGDDFNLLIAVTMLMVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGR 942 

I I Ml : M M I : I M I I I : I II. II I I : I I I I I : : I I I I I 

Db 583 TRPT-ADVFSLAQVSGLLLLDAALYGLATWYLEAVCPGQYGIPEPWNFPFRRSYWCGP-R 640 

Qy 943 TEAWEWSWPWARTPRLSVMEEDQACAMESRRFEETRGMEEEPTHLPLVVCVDKLTKVYKD 1002

I I : : I I I I I I I I I : 

Db 641 PPKSPAPCPTPLDPKVLV EEAPPGLSPGVSVRSLEKRFPG 680 

Qy 1003 DKKLALNKLS LNLYENQWS FLGHNGAGKTTTMS I LTGLFP PT S GSAT I YGHDI RTEMDE 1062 

: I I MM M : : I I I I I II I I || |: I I M I II I I : I I I I II M I : I 
Db 681 SPQPALRGLSLDFYQGHITAFLGHNGAGKTTTLSILSGLFPPSGGSAFILGHDVRSSMAA 74 0 

Qy 1063 IRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQ 1122 

I I : I M I I M II II I II I : I I : II I II I : : : | | : : : : M I : I : : 
Db 741 IRPHLGVCPQYNVLFDMLTVDEHVWFYGRLKGLSAAWGPEQDRLLQDVGLVSKQSVQTR 800 

Qy 1123 TLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHM 1182 

I I M M I I I I II I I I I I I I : : I I II I II I II I Ml I I: M I I I : II MM I I I M 
Db 801 HLSGGMQRKLSVAIAFVGGSQWILDEPTAGVDPASRRGIWELLLKYREGRTLILSTHHL 860 

Qy 1183 DEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKR — PAEPGGPQEPGLASSP 1240 

I I I : I I I I I : I ::: Ml MIIIMM I II Mill | : : | 

Db 861 DEAELLGDRVAVVAGGRLCCCGSPLFLRRHLGSGYYLTLVKARLPLTTNEKADTDMEGSV 920

Qy 1241 PGRAPLSSCSE LQVSQFIRKHVASCLLVS DTSTELSYILPSEAAKKGAFERLFQ 1294 

I • M M :: I II : , M Ml I Ml IM 

Db 921 DTRQEKKNGSQGSRVGTPQLLALVQHWVPGARLVEELPHELVLVLPYTGAHDGSFATLFR 980

Qy 1295 HLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGE 1354

M I I I : : | : I I : I I | : I M I M II M Ml 

Db 981 ELDTRLAELRLTGYGISDTSLEEIFLKWEE CAADT DMEDGSCGQHLCT 1029 

Qy 1355 GHAG-NLARCSELTQSQASLQSASSVGSA-RGDEGAGYTDVYGDYRPLFDNPQDPDNVSL 1412 

Ml:: : : : : I :: Ml M I : I III 
Db 1030 GIAGLDVTLRLKMPPQETALENGEPAGSAPETDQGSG PDAVG- 1071 

Qy 1413 QEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVA 1472 

M MM : I I |: I I I II I : : I M I M II I I I : I 

Db 1072 -RVQGWALTR QQLQALLLKRFLLARRSRRGLFAQIVLPALFVGLA 1115 

Qy 1473 MTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEERREYRLRLSP-DASPQQLVS 1531 

: : I M I I I I I I : I I : : : I : Ml : I : 



1116 LVFSLIVPPFGHYPALRLSPTMY GAQVSFFSED APGDPGRARLLE 1160 



1532 TFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVP 1591

= I I : I I : I II : I I : I 

1161 ALLQEAG LEEP PVQHSSH RF S AP EVPAE VAKVL AS GNWT P 1200 

1592 P P P S PAP S D S PAS P DE D LQAWNVS L P P TAG P EMWT SAP S L P RLVRE P VRCT C S AQGT G F- 1650 

M I I I I I I 

1201 ESPSPA CQCSQPGARRL 1217 

1651 —SCPSSVGGHPPQMRW-TGDILTDITGHNVSEYLLFTSDRFRLH RYG 1696 

I I :: I I I I I : | : : : : : | | | : | : : | : | | | | | 

1218 LPDCPAAAGGPPPPQAVTGSGEWQNLTGRNLSDFLVKTYPRLVRQGLKTKKWVNEVRYG 1277 

1697 AITFG NVLKS I PAS FGTRAP PMVRKI AV 1724 

: I I I I :: I : 
1278 GFSLGGRDPGLPSGQELGRSVEELWALLSPLPGGALDRVLKNLTA WAHSLDA 1329 

1725 RRAAQVFYNNKGYHSMPTYLNSLNNAIL-RANLPKSKGNPAAYGITVTNHPMNKTSASL-S 1783 

:::::: I I II : I I I : : I : I I I I I I : I I : I : I I I I I : I I | 

1330 QDSLKIWFNNKGWHSMVAFVNRASNAILRAHLPPGRAR-HAHSITTLNHPLNLTKEQLFE 1388 

1784 LDYLLQGTDWIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWL^YVW 1843 

: I I : : : I : : I I I I I II II : I : I : I : I I I I I : I : I : I I I I : : I 
1389 AALMASSVDVLVS I CWFAMS FVPAS FTLVLI EERVTRAKHLQLMGGLS PTLYWLGNFLW 144 8 

1844 DMLN YLVPATCCVI I L FVFDLPAYT S PTNFPAVLS LFLLYGWS I TPIMYPAS FWFEVP S S 1903 

I I I I I I I I 1:1 I I I : I I I I : I I I I I I I I I I I : I I I I I I : I I I I : 
1449 DMCNYLVPACIWLIFLAFQQRAYVAPANLPALLLLLLLYGWSITPLMYPASFFFSVPST 1508 

1904 AYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAY 1963 

I I I I I I II I I I : : I I I : I : II I : I : I : | | I I I I I : : I I I I : : I 
1509 AYVVLTCINLFIGINGSMATFVLELFS-DQKLQEVSRILKQVFLIFPHFCLGRGLIDMVR 1567 

1964 NEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGVVGFLLTIMCQYNFLRRPQRMP- 2022

I :::::: I : : I I I : : I : I : I I : : I : I I : : I : I I : I 

1568 NQAMADAFERLGD-RQFQSPLRWEWGKNLLAMVIQGPLFLLFTLLLQH RSQLLPQ 1622 

2023 — VSTKPV — EDDVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGV 2078 

I : I : I : I I I I I I : I I : : I I : : : MINI: : I : MINIM: 
1623 PRVRSLPLLGEEDEDVARERERWQGATQGDVLVLRNLTKVYRGQ RMPAVDRLCLGI 1679 

2079 RPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDA 2138 

I I I I I I I I I I I I I I I I I I I : I : I I I : Ml : I I I I M I : I I I I I II 

1680 PPGECFGLLGVNGAGKTSTFRMVTGDTLASRGEAVLAGHSVAREPSAAHLSMGYCPQSDA 1739 

2139 LFDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTA 2198

: I : I I I I I I : I Mil: I : I : I I : I I I : I I I I I I I I I I I I I : I I 

1740 IFELLTGREHLELLARLRGVPEAQVAQTAGSGLARLGLSWYADRPAGTYSGGNKRKLATA 1799 

2199 IALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAI 2258 

M M I II : I I I I I I I I I I I I I I I I I I : I ::: I I I I : I I I I I I I I I I I I I : I I I I 
1800 LALVGDPAWFLDEPTTGMDPSARRFLWNSLLAWREGRSVMLTSHSMEECEALCSRLAI 1859 

2259 MVNGRLRCLGSIQHLKNRFGDGYMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318 

I I I I I I I I I I I I I I I I I : : I : I : : : I : I I I : I : I I : : : 

1860 MVNGRFRCLGSPQHLKGRFAAGHTLTLRVPAARS-QPAAAFVAAEFPGSELREAHGGRLR 1918 



Qy 2319 YQL-KSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAK KQSDNLEQQE- 2373 

: I I : I I : I I : : I : I I : I I I I I I : I I : I : I I I I I : I 

Db 1919 FQLPPGGRCALARVFGELAVHGAEHGVEDFSVSQTMLEEVFLYFSKDQGKDEDTEEQKEA 1978 

Qy 2374 TEPPSALQSPLGCLLSLLRPRSAPTEL 2400 

: I III I I : I I I 

Db 1979 GVGVDPAPGLQHPKRVSQFLDDPSTAETVL 2008 



RESULT 14 
Q86UK0 

ID Q86UK0 PRELIMINARY; PRT; 2595 AA. 

AC Q86UK0; 

DT 01-JUN-2003 (TrEMBLrel. 24, Created) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE ABCA12 transporter subfamily A. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=22583451; PubMed=12697999; 

RA Annilo T., Shulenin S., Chen Z.Q., Arnould I., Prades C., Lemoine C.,

RA Maintoux C., Devaud C., Dean M., Denefle P., Rosier M.;

RT "Identification and characterization of a novel ABC A subfamily member,

RT ABCA12, located in the lamellar ichthyosis region on 2q34.";

RL Cytogenet. Genome Res. 98:169-176(2002).

RN [2]

RP SEQUENCE FROM N.A.

RA Annilo T., Shulenin S., Chen Z.Q., Arnould I., Prades C., Lemoine C.,

RA Maintoux C., Devaud C., Dean M., Denefle P., Rosier M.;

RL Submitted (JAN-2003) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AY219711; AAP21093.1;

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005524; F:ATP binding; IEA.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA.

DR GO; GO:0000166; F:nucleotide binding; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

SQ SEQUENCE 2595 AA; 293148 MW; A771C73A4276A238 CRC64;

Query Match 22.4%; Score 2831.5; DB 4; Length 2595; 

Best Local Similarity 30.1%; Pred. No. 3.3e-178; 

Matches 740; Conservative 432; Mismatches 814; Indels 469; Gaps 66; 

Qy 93 VTQLL- — ERLDRWEEGNLFDPARPSLGSELEALRQHLEALSAGPGTSGSHLDRSTVSS 149 

: I : I I I : : I I : I I I I I I I I I : I I II 
Db 436 LTELLCESETFSLIEKSCQLSDMSFGSLCEESEFDLQLLEAAELGTEIAASLLYHDNVIS 495 



Qy 150 FSL-DSVARNPQELWRFLTQNLSLPNSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLH 208 

: i : : I :: | | : : | | | : : : | : : I : 

Db 496 KKVRDLLTGDPSKI NLNMDQFLEQALQMNYLE — NITQLIPII EAMLHVNN SAD 547 

Qy 209 KGQEPWSRLGGNPLFRMEELLLAPALLEQLTCTPGSGE--LGRILTVPESQKGA 260 

I I = I : I I I I I I : :: I : I I 

Db 548 ASEKPGQLL EMFKNVE ELKEDLRRTTGMSNRTI DKLLAI PI PDNRAEI I SQV 599 

Qy 261 LQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQLGLDAPN 306 

I : I : : I : I : : I : I : : : : : : 

Db 600 FWLHSCDTNITTPKLEDAMKEFCNLSLSERSRQSYLIGLTLLHYLNIYNFTDKVFFPRKD 659 

Qy 307 GSDSSPQAPPPR RLQALLGD LLDAQKVLQDVDVL SALALL 34 6 

II: I I : : I III : | : : : : : | 

Db 660 QKPVEKMMELFIRLKEILNQMASGTHPLLDKMRSLKQMHLPRSVPLTQAMYRSN 713 

Qy 347 L PQ GACT GRT PGP PAS GAG GAAN GTGAGAVMG PN AT AE EGAP S A 390 

III::: : I | | | : : : | : | 

Db 714 RMNTPQGSFSTISQALCSQGI TTEYLTAMLPSSQRPKGNHTKDFLTYKLTKEQIA 768 

Qy 391 AALATPDTLQGQCSAFVQ LWAGLQPILCGNNRTIEPEALRRGNMSSLGFT 440 

: I I : : : I I I : I : I I 
Db 769 SKYGIPINTTPFCFSLYKDIINMPAGPVIWAFLKPMLLG 807 

Qy 441 SKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVWLNIS-AE 499 

: I I : I I : : I : I I : : : I I : : I 

Db 808 RILHAPYNPVTKAIMEKSNVTLRQLAELREKSQEWMDKSPLF 84 9 

Qy 500 IRSF — LEQG — RLQQHLR — WLQQYVA-ELRLHPEALNLSLDELPPALRQDNFSLPSGM 552 

: II I I II II ::| :| : I I :||l II I- : : 

Db 850 MNS FHLLNQAI PMLQNTLRNPFVQVFVKFSVGLDAVELLKQI DEL- DI LR LKLENNI 905 

Qy 553 ALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQA -Y 598 

:: I I : I : I : : I : I III II I 

Db 906 DIIDQLNT LSSLTVNI S SCVLYDRIQAAKT I DEMEREAKRLY 947 

Qy 599 QDNVTVFASVI FQTRKDGS LPPHVHYKIRQNSSFTEKTNEIRRAYWRPG 647 

: I : I I I I I : : I I I I : I I I : : I : I III 

Db 948 KSN-ELFGSVIFKLPSNRSWHRGYDSGNVFLPPVIKYTIRMSLKTAQTTRSLRTKIWAPG 1006 

Qy 64 8 PNTGGRFYFLYG— FVWIQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIE 7 05 

I: :M I ::: I I : I I I I I: I : I II I I I I :. : I : I I ': 

Db 1007 PHNS PSHNQI YGRAFI YLQDS I ERAI I ELQTGRNSQEI AVQVQAI PYPCFMKDNFLTSVS 1066 

Qy 706 HMMPLCMVISWWSVAMTIQHIVAEKEHRLKEVMKTMGLNNAVHWAWFITG 765 

: :|: ::::|| :| : : : I I I : I I I I I I I : I : hill I ::: 

Db 1067 YSLP I VIJWAWVVFI AAFVKKLVYEKDLRLHE Y I 1126 

Qy 766 TALTAILKYGQVI^HSHVVIIWLFLAW 825 

I I I I : I : I : : I : : I : : I : : I : I : | | : : : | : | :| | : : 
Db 1127 VILIIILKFGNILPKTNGFILFLYFSDYSFSVIAMSYLISVFFNNTNIAALIGSLIYIIA 1186 

Qy 826 YVPYMYVAI REEVAHDKITAFEKCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSP 885 

: I : : : I : : : : I I I : I I I I hi I I I : I : I I I I 

Db 1187 FFPFIVLVTVE NELSYVLKVFMSLLSPTAFSYASQYIARYEEQGIGLQWENMYTSP 1242 



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 
Db 

Qy 
Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 
Qy 

Db 

Qy 

Db 

Qy 
Db 

Qy 

Db 

Qy 



886 VEGDDFNLLLAVTMLMVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYW LGSGR 942 

I : I : : : : | : : | : : | | : I II I I : Mill: III I 

1243 VQDDTTS FGWLCCLI LADS FIYFLIAWYVRNVFPGTYGMAAPWYFPI LPS YWKERFGCAE 1302 

943 TEAWEWSWPWARTPRLS VMEEDQACAMESRRFEETRGMEEEPTHLPLWCVD 994 

: II : I : I I : ' : I I I I : I : 

1303 VK PEKSNGLMFTNIMMQNTNPSASPEYMF--SSNIEPEPKDLTVGVALH 134 9 

995 KLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGH 1054 
: I I : I | : | : : | : | | | I : I I I I I I I I I I I : I : I I I I I : : I : : I I 

1350 GVTKIY--GSKVAVDNLNLNFYEGHITSLLGPNGAGKTTTISMLTGLFGASAGTIFVYGK 1407 

1055 DIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLK — SMAQEEI RREMDKMI EDLE 1112 

I I : I : : : I I I : I : I I I : I I I I I : I I I I : I : : : : I : : : : I 

1408 DIKTDLHTVRKNMGVCMQHDVLFSYLTTKEHLLLYGSIKVPHWTKKQLHEEVKRTLKDTG 1467 

1113 LSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKYKPG 1172 

I : II I I I I I I I I I II I :: I I : I I I I : I I I I I : I I I I : I I : I I I : I I I 
1468 LYSHRHKRVGTLSGGMKRKLSISIALIGGSRWILDEPSTGVDPCSRRSIWDVISKNKTA 1527 

1173 RTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQ 1232 

I I I : I I I I I : I I I :: I I I I I : I I : I I I I I : I I : I I I I I I I I : : 
1528 RTI I LSTHHLDEAEVLS DRI AFLEQGGLRCCGS PFYLKEAFGDGYHLTLTKK K 1580 

1233 EPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAK- KGAFER 1291 

II:: : I : I : I : I : I I I I I : I I : I I I : 

1581 SPNLNAN AVCDTMAVTAMI Q S H L P EAYLKED I GGELVYVL P P FS T KVS GAYLS 



1292 



1633 



1351 



LFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPA 
.1:1: : I : : : I : I I I : I I I I I : : : I I : j I : : 
1634 LLRALDNGMGDLNIGCYGISDTTVEEVFLNLTKESQ— KNSAMSLE : 1677 

1352 SGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLFDNPQDPDNVS 1411 

III : :|: : : ||::| 
1678 HLTQKKI GNSNANGI ST /- PDDLS 1699 

1412 LQ EVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPA 1466 

: : : : I : I : I M | : : : : I : I I I I I I I I I : I : : I I 

1700 VSSSNFTDRDDKILTR GERLDGFGLLLKKIMAI LI KRFHHTRRNWKGLIAQVI LPI 1755 

1467 FFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEERREYRLRLSPDASP 1526 

MM: ' I : : II I I I : I III | 
1756 VFVTTAMGLGTLRNSSNSYPEIQISPSLYG — TSEQTAF — YANYH PST 1800 

1527 QQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPL 1586 

: I I I I : Mill: 
1801 EALVSAMWDFPGI DNMCLNT 1820 

1587 SNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVR CT 1642 

II : I : II I II: |: 

1821 SDLQCLNKDSLEKWNTS GEPITNFGVCS 1848 

1643 CSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLHRYGAITFGN 1702 

II M II I : : : : : I I I II : I : : I I I I Ml 

184 9 CSENVQ — ECP-KFNYSPPHRRTYSSQVI YNLTGQRVENYLISTANEFVQKRYGGWSFGL 1905 



1703 VL- 



- KS I PAS FGTRAP PMVRKI AVRRAAQVFYNNKGYH SMPT YLN S LNNAI LRA 1754 



Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



I : I I : 

1906 PLTKDLRFDITGVPAN- 



I I : I : I : : I I I I : I I I I I I I I : I I 

-RTLAKVWYDPEGYHSLPAYLNSLNNFLLRV 1951 



1755 NLPKSKGNPAAYGITVTNHP MNKTSASLSLDYLLQGTDWIAI FI IVAMSFVPASF 1810 

I : I I : I : I I : : I I : : I : : I I : : : I : ! : : I III 

1952 NM--SKYDAARHGIIMYSHPYPGVQDQEQATIS S L I D I LVALS I LMG Y S VTTAS F 2004 

1811 WFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSP 1870 

I ::| I 111111:11 ||: |:::||: I I I I : |: :| |||: | 

2005 VTYWREHQTKAKQLQHISGIGWCYWVTNFIYDMVFYLVPVAFSIGIIAIFKLPAFYSE 2064 

1871 TNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITA TVATFLL 1926 

I II I I I : I : II : I I : : : : I I I I I : : I II 

2065 NNLGAVSLLLLLFGHATFSWMYLLAGLFHETGMAFITYVCVNLFFGINSIVSLSWYFLS 2124 

1927 QLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEW 1986 

: : I I : : : : II Mill: I : I I : I : : : : : : I : II 

2125 KEKPNDPTLELISETLKRIFLIFPQFCFGYGLIELSQQQSVLDFLKAYG-VEYPNETFEM 2183 

1987 DIVTRGLVAMAVEGWGFLLTIMCQYNFLR RPQRMPVSTKPVEDDVDVASERQR 2040 

: : | | : : | : | | : : : : : | : : : : | | | : | | | 

2184 NKLG7\MFVALVSQGTMFFSLRLLINESLIKKLRLFFRKFNSSHVRETIDEDEDVRAERLR 2243 

2041 VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 

I I I : I : I : : I I I I : : I : I i : : : I : I I I I I I I I I I I I I I I : I I I 

2244 VESGAAEFDLVQLYCLTKTYQLIH-KKIIAVNNISIGIPAGECFGLLGVNGAGKTTIFKM 2302 

2101 LTGD ESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQ 2150 

I I I I- : I I I : I I I : I I I I I I I I I : I III 

2303 LTGDIIPSSGNILIRNKTGSLGHVDSHSSL VGYCPQEDALDDLVTVEEHLY 2353 

2151 LYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFL 2210 

I I : I I II I I : I I : I : I I I I I II I I : I I I I I : : I 

2354 FYARVHGIPEKDIKETVHKLLRRLHLMPFKDRATSMCSYGTKRKLSTALALIGKPSILLL 2413 

2211 DEPTTGMDPKARRFLWNLILDLIKTGRSVVLTSHSMEECEALCTRLAIMVNGRLRCLGSI 2270 

I I I :: I I I I I : : I II : I : :: I I : I I I I I I I I I I I I I I I I I I I I I I : : I : I I : 
2414 DEPSSGMDPKSKRHLWKIISEEVQNKCSVI LTSHSMEECEALCTRLAIMVNGKFQCIGSL 2473 

2271 QHLKNRFGDGYMITVRTKSSQ-SVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLA 2329 

I I : I : I I I I : : I I :::::::: I : I | : | | : : | : : : | : : | 

2474 QHIKSRFGRGFTVKVHLKNNKVTMETLTKFMQLHFPKTYLKDQHLSMLEYHVPVTAGGVA 2533 

2330 QVFSKMEQVSGVLGIEDYSVSQTTLDWFWFAKKQSDNLEQQETEPPSALQSPL 2384 

: I : I I I :: I I I I I I : I I : I I I I I : III: I : 

2534 NIFDLLETNKTALNITNFLVSQTTLEEVFINFAKDQ KSYETADTSSQGSTI 2584 



RESULT 15
Q8IZW6

ID Q8IZW6 PRELIMINARY; PRT; 2347 AA.

AC Q8IZW6;

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 
DE ATP-binding cassette transporter family A member 12. 
OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Schaap F.G., van Wijland M., Groen A.K.;

RT "Cloning of a novel ABC transporter (ABCA12) tentatively involved in

RT lipid homeostatis.";

RL Submitted (SEP-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF418105; AAN40735.1;

DR Genew; HGNC:14637; ABCA12.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005524; F:ATP binding; IEA.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; IEA.

DR GO; GO:0000166; F:nucleotide binding; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

KW ATP-binding.

SQ SEQUENCE 2347 AA; 264963 MW; 9B6E13FD0F0F67AD CRC64;



Query Match 22.3%; Score 2827.5; DB 4; Length 2347; 

Best Local Similarity 30.2%; Pred. No. 5.1e-178; 

Matches 741; Conservative 432; Mismatches 813; Indels 469; Gaps 66; 

Qy 93 VTQLL ERLDRWEEGNLFDPARPSLGSELEALRQHLEALSAGPGTSGSHLDRSTVSS 149 

: I : I I I : : I I : I I I I I III I : I I II 

Db 188 LTELLCESETFSLIEKSCQLSDMSFGSLCEESEFDLQLLEAAELGTEIAASLLYHDNVIS 247 

Qy 150 FSL-DSVARNPQELWRFLTQNLSLPNSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLH 208

: I : : I : : I I : : III : : : I : : I : 

Db 248 KKVRDLLTGDPSKI NLNMDQFLEQALQMNYLE — NITQLIPIIEAMLHVNNSAD 299 

Qy 209 KGQEPWSRLGGNPLFRMEELLLAPALLEQLTCTPGSGE — LGRILTVPESQKGA 260 

: : I I : I : I I I I I I : : : I : I I 

Db 300 ASEKPGQLL EMFKNVE ELKEDLRRTTGMSNRTI DKLLAI P I PDNRAEI I SQV 351 

Qy 261 LQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQLGLDAPN 306 

I : | : : I : I : : I : I : : : : : : 

Db 352 FWLHSCDTNITTPKLEDAMKEFCNLSLSERSRQSYLIGLTLLHYLNIYNFTYKVFFPRKD 411 

Qy 307 GSDSSPQAPPPR- RLQALLGD LLDAQKVLQDVDVLSALALL 346 



Db 412 QKPVEKMMELFIRLKEILNQMASGTHPLLDKMRSLKQMHLPRSVPLTQAMYRSN 465 

Qy 347 L P Q GACT G RT P G P PAS GAGGAAN GT GAGAVMG PNAT AEEGAP S A 390 

III::: : I I I I : : : I : I 

Db 466 RMNTPQGSFSTISQALCSQGI TTEYLTAMLPSSQRPKGNHTKDFLTYKLTKEQIA 520 

Qy 391 AALATPDTLQGQCSAFVQ LWAGLQ P I LCGNN RT I E P EAL RRGNMS S LG FT 440 

: I I : : : I I I : I : I I 
Db 521 SKYGIPINTTPFCFSLYKDIINMPAGPVIWAFLKPMLLG 559 



441 SKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVWLNIS-AE 499 

:||||| : : I : I I : : : I I : : I 

560 RILYAPYNPVTKAIMEKSNVTLRQLAELREKSQEWMDKSPLF 601 

500 IRSF — LEQG — RLQQHLR — WLQQYVA-ELRLHPEALNLSLDELPPALRQDNFSLPSGM 552 

: I I I I I I I I :: I : I : I I : I I I II I : : 
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I : : I I I : : : I I : I I II I : I : I I I I I I I : : I : I I : 
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706 HMMPLCMVISWVYSVAMTIQHIVAEKEH 765 

: : I : : : : : I I : I : : : I I I : I I I I I I I : I : hill . I : : : 
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1527 QQLVSTFRLPSGVGATCVLKSP7\NGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPL 1586 

: III h hill : 

1553 EALVSAMWDFPGI DNMCLNT 1572 
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Db 2286 NIFDLLETNKTALNITNFLVSQTTLEEVFINFAKDQ KSYETADTSSQGSTI 2336 
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Title: 
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Scoring table: 



US-10-088-467-2 
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1 MG FLHQLQLLLWKN VT LKRR GLISFEEERAQLSFNTDTLC 2436 

BLOSUM62 

Gapop 10.0, Gapext 0.5



Searched: 141681 seqs, 52070155 residues 

Total number of hits satisfying chosen parameters: 141681



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : SwissProt 42:* 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
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ALIGNMENTS 



RESULT 1 
ABC2_HUMAN 

ID ABC2_HUMAN STANDARD; PRT; 2436 AA.

AC Q9BZC7; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE ATP-binding cassette, sub-family A, member 2 (ATP-binding cassette 

DE transporter 2) (ATP-binding cassette 2).

GN ABCA2 OR ABC2.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX PubMed=11178988;

RA Kaminski W.E., Piehler A., Pullmann K., Porsch-Oezcueruemez M.,

RA Duong C., Bared G.M., Buchler C., Schmitz G.;

RT "Complete coding sequence, promoter region, and genomic structure of

RT the human ABCA2 gene and evidence for sterol-dependent regulation in

RT macrophages.";

RL Biochem. Biophys. Res. Commun. 281:249-258(2001).

CC -!- FUNCTION: Probable transporter, its natural substrate has not been

CC found yet. May have a role in macrophage lipid metabolism and

CC neural development.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Potential).

CC -!- SIMILARITY: Belongs to the ABC transporter family. ABC A subfamily.

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

DR
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.1; 
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AAK14335 
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MIM; 600047; 
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GO; GO:0016021; 
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: integral to membrane; NAS. 
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GO; GO: 0004009; 


F:ATP-binding cassette 


(ABC) transporter acti. . 


DR GO; GO:0006629; P: lipid metabolism; NAS.

DR GO; GO:0006810; P: transport; NAS.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.
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SMART; SM00382; 


AAA; 2. 










DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

KW ATP-binding; Transport; Transmembrane; Repeat; Glycoprotein.
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SQ SEQUENCE 2436 AA; 269971 MW; 9E6688D8615DE06D CRC64; 



Query Match 99.9%; Score 12658; DB 1; Length 2436; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 2435; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy    1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60

Qy   61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRVVEEGNLFDPARPSLGSE 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRVVEEGNLFDPARPSLGSE 120

Qy  121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180

Qy  181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240

Qy  241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300

Qy  301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSALALLLPQGACTGRTPGPP 360

Qy  361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQCSAFVQLWAGLQPILCGN 420

Qy  421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETF 480

Qy  481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 540

Qy  541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600

Qy  601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 NVTVFASVIFQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660


Qy 


661 


WIQDMMERAIIDTFVGHDVVEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 


720 






1.1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 I I I I I I I I I I I 




Db 


661 


WIQDMMERAIIDTFVGHDWEPGSWQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720 


Qy  721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMH 780



781 SHWIIWLFLAVYAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 84 0 

I M M I I M I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I 

781 SHWIIWLFIAWAVATIMFCFLVSVLYSKAKLASACGGIIYFLSYVPYMYVAIREEVAH 840 
841 DKI TAFEKC I AS LMSTTAFGLGS KYFAL YEVAGVGI QWHT FSQS PVEGDDFNLLLAVTML 900 

I I I I I N I I I I I I I I I I I I I I I I I I I I I I LI | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 

841 DKI TAFEKC IAS LMSTTAFGLGS KYFAL YEVAGVGI QWHT FSQS PVEGDDFNLLLAVTML 900 

901 MVDAWYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960 
I I M I I I I I I I I I I I I I I I | | | | | | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I | | I I 
v 901 MVDAWYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSV 960 

961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020 

I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | | M I I I I I I I I 
961 MEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQV 1020 

1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1021 VSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRL 108 0 

1081 TVEEHLWFYSRLKSMAQEEI RREMDKMI EDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 114 0 

I I I M I I I I I I I I I I I I I I I I I I I I ! I I I I I I I I | | | | | | | | I I I I I I I I |j| I I I I I I I 
1081 TVEEHLWFYSRLKSMAQEEI RREMDKMI EDLELSNKRHSLVQTLSGGMKRKLSVAIAFVG 114 0 

1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 

1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200 

12 01 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1201 KCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRK 1260 

12 61 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 132 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I 

12 61 HVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFL 132 0 

1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVG 138 0 

M M I I I I I I I I I I I I I I I I I I I I I I I I I I | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1321 KVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNL7VRCSELTQSQASLQSASSVG 1380 
1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGWLKVRQF 144 0 

I I I I M I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1381 SARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVKAEALSRVGQGSRKLDGGWLKVRQF 144 0 

1441 HGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1441 HGLLVKRFHCARRNSK7VLFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQ 1500 

1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560 

N I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1501 PRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLS 1560 

1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTA 162 0 

I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1561 SGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPS PAPS DS PAS PDEDLQAWNVSLPPTA 1620 



I 



Qy 


1621 


G P EMWT S AP S L P RLVRE P VRC T C S AQGT G FS C P S S VGGH P PQMRWT GDILTDIT GHNVS 1680 

N 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 I | | 1 1 | 1 1 | | | | | | | | | | | | | | | | | | | | | | M 1 1 | 

GPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVS 1680 


Db 


1621 


Qy 


1681 


EYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSM 17 40 

1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 I M 1 II 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 

E YLLFTS DRFRLHRYGAI T FGNVLKS I PAS FGTRAP PMVRKI AVRRAAQVFYNNKGYHSM 1740 


Db 


1681 


Qy 


1741 


PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800 
1 1 II 1 1 1 1 II 1 1 1 1 1 1 M 1 1 1 1 1 1 1 II 1 II 1 1 1 1 1 II 1 1 II 1 1 1 1 1 1 II II I I I I I I I || 
PTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQGTDWIAIFII 1800 


Db 


1741 


Qy 


1801 


VAMS FVPAS FWFLVAEK S T KAKH LQ FVS GCN P 1 1 YWLAN YVWDMLN Y L VP AT CCVI I L F 1860 

1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

VAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWI^YVWDMLNYLVPATCCVIILF 1860 


Db 


1801 


Qy 


1861 


VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920 

IN 1 1 II M 1 1 1 1 1 II 1 1 1 I I I I I I I I | | || | | | | | | || | | | | | | | | | | | | | | | | . 

VFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITAT 1920 


Db 


1861 


Qy 


1921 


VATFLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 1980 
1 1 1 1 1 N M 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 II 1 1 1 1 1 II 1 1 1 1 1 1 1 | | | | | | | | | | 
VATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKM 198 0 


Db 


1921 


Qy 


1981 


KS P FEWDI VTRGLVAMAVEGWGFLLT IMCQYN FLRRPQRMPVSTKPVEDDVDVAS ERQR 204 0 

i m i ri 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

KSPFEWDI VTRGLVAMAVEGWGFLLT IMCQYNFLRRPQRMPVSTKPVEDDVDVASERQR 204 0 


Db 


1981 


Qy 


2041 


VLRGDADNDMVKIENLTKVYKSRKIGRIIAVDRLCLGVRPGECFGLLGVNGAGKTSTFKM 2100 
1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 II 1 1 M 1 1 1 1 1 1 1 
VLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRLGECFGLLGVNGAGKTSTFKM 2100 


Db 


2041 


Qy 


2101 


LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAXFDELTAREHLQLYTRLRGISW 2160 
1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 | I | | | | | | | | | | 
LTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISW 2160 


Db 


2101 


Qy 

Db 


2161 
2161 


KDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 222 0 
1 1 1 1 1 1 1 1 1 1 M 1 1 M 1 1 M 1 1 1 1 II 1 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 | | | | | | | | | | | | | 
KDEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPK 2220 


Qy 


2221 


ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 
1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 I I I | | | || | | 
ARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDG 2280 


Db 


2221 


Qy 


2281 


YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 
1 1 1 1 M M 1 1 1 1 1 1 1 1 1 1 I I I I | | | | | | | | | | | | | | | | | | | | | | | f| | | | | | | | | | | | | | 

YMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSG 2340 


Db 


2281 


Qy 


2341 


VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPSALQSPLGGLLSLLRPRSAPTEL 2400 
1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 l l 

VLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQETEPPS7VLQSPLGCLLSLLRPRSAPTEL 2400 


Db 


2341 


Qy 


2401 


RALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 
RALVADEPEDLDTEDEGLI SFEEERAQLSFNTDTLC 2436 


Db 


2401 



RESULT 2 
ABC2_MOUSE

ID ABC2_MOUSE STANDARD; PRT; 2434 AA.

AC P41234;

DT 01-FEB-1995 (Rel. 31, Created)

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE ATP-binding cassette, sub-family A, member 2 (ATP-binding cassette 

DE transporter 2) (ATP-binding cassette 2). 

GN ABCA2 OR ABC2.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 
RN [1] 

RP SEQUENCE FROM N.A., AND REVISIONS.

RC STRAIN=DBA/2;

RA Chimini G.;

RL Submitted (DEC-2000) to the EMBL/GenBank/DDBJ databases.
RN [2]

RP SEQUENCE OF 964-2434 FROM N.A.

RC STRAIN=DBA/2; TISSUE=Macrophage;

RX MEDLINE=94375008; PubMed=8088782;

RA Luciani M.F., Denizot F., Savary S., Mattei M.-G., Chimini G.;

RT "Cloning of two novel ABC transporters mapping on human chromosome

RT 9.";

RL Genomics 21:150-159(1994). 

CC -!- FUNCTION: Probable transporter, its natural substrate has not been

CC found yet. May have a role in macrophage lipid metabolism and

CC neural development.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Potential). 

CC -!- TISSUE SPECIFICITY: Widely expressed in adult tissues. Highest 
CC levels are found in brain and pregnant uterus. 

CC -!- SIMILARITY: Belongs to the ABC transporter family. ABC A subfamily. 

CC , 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

cc 

DR EMBL; X75927; CAA53531.2; -. 

DR MGD; MGI:99606; Abca2.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

KW ATP-binding; Transport; Transmembrane; Repeat; Glycoprotein.

FT TRANSMEM 21 40 POTENTIAL.
FT TRANSMEM 705 727 POTENTIAL.
FT TRANSMEM 748 770 POTENTIAL.
FT TRANSMEM 780 802 POTENTIAL.
FT TRANSMEM 809 831 POTENTIAL.
FT TRANSMEM 1793 1815 POTENTIAL.
FT TRANSMEM 1846 1865 POTENTIAL.
FT TRANSMEM 1875 1897 POTENTIAL.
FT TRANSMEM 1904 1926 POTENTIAL.
FT NP_BIND 1024 1031 ATP (POTENTIAL).
FT NP_BIND 2088 2095 ATP (POTENTIAL).
FT CARBOHYD 14 14 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 89 89 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 168 168 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 173 173 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 305 305 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 368 368 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 379 379 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 420 420 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 432 432 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 476 476 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 484 484 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 494 494 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 530 530 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 548 548 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 589 589 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 599 599 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 627 627 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1408 1408 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1496 1496 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1549 1549 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1557 1557 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1613 1613 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1678 1678 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1776 1776 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 2055 2055 N-LINKED (GLCNAC...) (POTENTIAL).
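
The feature table (FT lines) lists each annotated region as a key followed by its start and end residue numbers. A minimal Python sketch, assuming whitespace-separated columns as printed in this listing (the original flat file uses fixed-width columns), collects the spans and reports their inclusive lengths. The helper name parse_ft is hypothetical, and the sample lines are taken from the table above.

```python
def parse_ft(lines):
    """Parse 'FT KEY START END [description]' lines into tuples."""
    features = []
    for line in lines:
        parts = line.split(None, 4)
        # Skip anything that is not a complete FT record.
        if len(parts) < 4 or parts[0] != "FT":
            continue
        key, start, end = parts[1], int(parts[2]), int(parts[3])
        description = parts[4].strip() if len(parts) > 4 else ""
        features.append((key, start, end, description))
    return features


sample = [
    "FT TRANSMEM 21 40 POTENTIAL.",
    "FT TRANSMEM 705 727 POTENTIAL.",
    "FT NP_BIND 1024 1031 ATP (POTENTIAL).",
    "FT CARBOHYD 14 14 N-LINKED (GLCNAC...) (POTENTIAL).",
]

features = parse_ft(sample)
for key, start, end, description in features:
    # Residue numbering is inclusive, so a span covers end - start + 1 residues.
    print(key, start, end, end - start + 1)
```

A single-residue feature such as a CARBOHYD site has identical start and end positions and therefore a length of 1.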


SQ SEQUENCE 2434 AA; 270582 MW; 3CEDD48ED5692005 CRC64;



Query Match 89.6%; Score 11349; DB 1; Length 2434; 

Best Local Similarity 90.6%; Pred. No. 0; 

Matches 2217; Conservative 51; Mismatches 155; Indels 24; Gaps 9;

Qy    1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60
Db    1 MGFLHQLQLLLWKNVTLKRRSPWVTAFEIFIPLVLFFILLGLRQKKPTISVKEA-FYTAA 59

Qy   61 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLDRVVEEGNLFDPARPSLGSE 120
Db   60 PLTSAGILPVMQSLCPDGQRDEFGFLQYANSTVTQLLERLHRVVEEGNLFDPVRPSLGSE 119

Qy  121 LEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLPNSTAQAL 180
Db  120 LEALRQRLEALSSGPGTWESHSARPAVSSFSLDSVARDQRELWRFLMQNLSLPNSTAQAL 179

Qy  181 LAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPALLEQLTC 240
Db  180 LAARVDPSEVYRLLFGPLPDLDGKLGFLRKQEPWSRLGSNPLLQMEELLLAPALLEQLTC 239

Qy  241 TPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDVAKVSQQL 300
Db  240 APGSGELGRILTMPEGHQVDLQGYRDAVCSGQATARAQRFSDLAAELRNQLDTAKIAQQL 299



301 GLDAPNGSDSSPQAPPPRRLQALLGDLLDAQKVLQDVDVLSAIJ\LLLPQGACTGRTPGPP 360 

I I I I I I I I I I I I : I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I : I 

300 GFDVPNGSDPQPQAPSPQSLPALLGDLLDAQKLLQDVDVLSALALLLPQGACAGQASAPQ 359 

361 AS GAGGAAN GT GAGAVMG PNAT AE EGAP S AAALAT P DT LQ GQ C S AFVQ LWAG LQ P I LC GN 420 

II I I I I I I I I I I I I I I : I : I I I I I I I I I I I I I I I I I I I I I I I I I 

360 ASSLNGLANSTGIGANSGSNTTVEEGTQSPVS PASPDTLQGQCSAFVQLWAGLQPILCGN 419 

421 NRT I EP EALRRGNMS S LGFT S KEQRNLGLLVHLMTSN PKI L YAP AGS EVDRVI LKANET F 480 

I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I M I I I I I I I I I I Ml I I I I I I I I I I I 
420 NRTIEPEALRRGNMSSLGFTSKEQRNLGLLVHLMTSNPKILYAPVGSEADRVILKANETF 479 

481 AFVGNVTHYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPA 54 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I : I : I I I I I : I I I I : I II I I 
48 0 AFVGNVTHYAQVWLNISTEIRSFLEQGRLQQHLQWLQQYVADLQLHPEAMNLSLEELPPA 539 

541 LRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 600 

I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | | | | | | | | 
540 LRQD-FSLPNGTALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQD 598 

601 NVTVFASVI FQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGF 660 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

599 NVTVFASVI FQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRFYFLYGL 658 

661 WIQDMMERAIIDTFVGHDVVEPGSWQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSV 720 

I I I I I I I I I : I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
659 RLDQDMMERAIINTFVGHDWEPGNWQMFPYPCYTRDDFLFVIEH>IMPLC^ISWVYSV 718 

721 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWAWFITGFVQLSISWALTAILKYGQVTJ^H 780 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
719 AMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYCQVLMH 77 8 

781 SHWIIWLFIAWAVATIMFCFLVSVLYSKAKIASACGGIIYFLSYVPYMYVAIREEVAH 84 0 

I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
779 SHVLIIWLFIAVYAVATIMFCFLVSVLYSKAKl^SACGGIIYFLSWPYMYVAIREEVAH 838 

841 DKITAFEKCIASLMSTTAFG-LGSKYFALYEVAGVGIQWHTFSQSPVEGDDFNLLLAVTM 899 

I I II I I I I I I I I : I : I I I I I I I I I I I I I I I I I I I I I I I I 

839 DKI TAFEKC IAS RC PQQPLAWVP STLHCMKWQEWAS I QWHTFSQS PVEGDDFNLLLAVTM 898 

900 LMVDAWYGILTWYIEAVHPGMYGLPRPWYFPLQKSY-WL GSGRTEAWEWSW 950 

I I I I I I I I : I I I I I I I I I I I I I I I I I I I I : I I : I I 

899 LMVDTWYGVLTW Y I EAVH P GMYGL P RPW YS R YRS P I GWAVGGQKP GS GAGHGHTH RAS A 958 

951 PWARTPRLSVMEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNK 1010 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I 

959 LWRRI QACAMESRHFEETRGMEEEPTHLPLWCVDKLTKVYKNDKKLALNK 1009 

1011 LSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMC 1070 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1010 LSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMC 1069 

1071 PQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKR 1130 

I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1070 PQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRKETDKMIEDLELSNKRHSLVQTLSGGMKR 1129 
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1131 


Db 


1130 


Qy x 


1191 


Db 


1190 


Qy 


1251 


Db 


1250 


Qy 


1311 


Db 


1310 


Qy 


1371 


Db 


1370 


Qy 


1431 


Db 


1430 


Qy 


. 1491 


Db 


1490 


Qy 


1551 


Db 


1550 


Qy 


1610 


Db 


1610 


Qy 


1670 


Db 


1670 


Qy 


1730 


Db 


1730 


Qy 


.1790 


Db 


1790 


Qy 


1850 


Db 


1850 


Qy 


1910 


Db 


1910 


Qy 


1970 



KLSVAIAFVGGSRAI ILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGD 1190 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 

KLSVAI AFVGGS RAI I LDEPTAGVDP YARRAIWDLI LKYKPGRT I LLSTHHMDEADLLGD 1189 

RIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCS 1250 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I | | | | I I I I I M II I Mill 
RIAIISHGKLKCCGSPLFLKGAYXDGYRLTLVKQPAEPGTSQEPGLASSPSGCPRLSSCS 124 9 

ELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGL 1310 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I II I 
EPQVSQFIRKHVASSLLVSDTSTELSYILPSEAVKKGAFERLFQQLEHSLDALHLSSFGL 1309 

MDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQ 1370 

I I > I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I II I I I I I I I I III 

MDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGLTAVGGQAGNLARCSELAQSQ 1369 

ASLQSASSVGSARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKL 1430 
I I I I I I I I I I I I I I: I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I : : I I I I I I I I 
ASLQSASSVGSARGEEGTGYSDGYGDYRPLFDNLQDPDNVSLQEAEMEALAQVGQGSRKL 1429 

DGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVL 14 90 
: I I I I : I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
EGWWLKMRQFHGLLVKRFHCARRNSKALCSQILLPAFFVCVAMTVALSVPEIGDLPPLVL 14 8 9 

SPSQYHNYTQPRGNFIPYANEERREYRLRLS PDAS PQQLVSTFRLPSGVGATCVLKS PAN 1550 
I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
SPSQYHNYTQPRGNFIPYANEERQEYRLRLS PDAS PQQLVSTFRLPSGVGATCVLKS PAN 1549 

GSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPAS PDED-L 1609 
Mill I I II I I I II I I I II II II I I I I I I I II II II I II I I I I I I II I I I I I I I I I 
GSLGPMLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPVXPDEDSL 1609 

QAWNVS L P P TAG P EMWT SAP S L P RLVRE P VRCT C S AQGT G FS C P S S VGGH P PQMRWT GD 1669 
M I I : I I II I I I I I I I I I I II II II II I II I I I I I II II I II II II I I I I I II I I I I I 
QAWNMSLPPTAGPETWTSAPSLPRLVHEPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGD 1669 

ILTDITGHNVSEYLLFTSDRFRLHRYGAITFGNVLKSIPASFGTRAP PMVRKIAVRRAAQ 172 9 
I I I II I II I I I I I I I II I I II II I II I I II II I I I I I I II I I I II I I II I I I I I II 
I LTDITGHNVSEYLLFT SDRFRLHRYGAI TFGNVQKS I PAS FGARVP PMVRKI AVRRVAQ 1729 

VFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQ 1789 

I M M M M I II II I I I I I I II M M I I M I M M M II I I I I I I I M I I I M I I I II I 

VLYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQ 1789 

GTDWIAIFIIVAMSFVPAS FWFLVAEKSTKAKHLQFVSGCNPII YWLANYWDMLNYL 184 9 
M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | : | | | | | | | | | | | | | | | 
GTDWIAI FI I VAMS FVPAS FWFLVAEKSTKAKHLQFVSGCNPVI YWLANYVWDMLNYL 1849 

VPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLI 1909 
M I I I I I I I I I I M II II I I I I II I I I II I II II II I II I || || I I I I I I I I I I II I I II , 
VPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLI 1909 

VINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLMEMAYNEYINE 1969 
I M II I I I I I I I II I II I I II I II I I I I I I II I I I I I II I II I I I I I I I I I I I I I I I I || 
VINLFIGITATVATFLLQLFEHDKDLKVVNSYLKSCFLIFPNYNLGHGLMEMAYNEYINE 1969 

1970 YYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVE 2029 



I I I I I I I I I I I I I I I I I I I I I I I I I I I III III I I I I I I I I I I I : I I I : I I I I I I I I 

Db 1970 YYAKIGQFDKMKSPFEWDIVTRGLVAMTVEGFVGFFLTIMCQYNFLRQPQRLPVSTKPVE 2029 

Qy 2030 DDVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGV 2089 

M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 . 

Db 2030 DDVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGV 2089 

Qy 2090 NGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHL 2149 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I.I : I I I I I I I I I I I I I I I I I I II I I I I I I 
Db 2090 NGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKDLLQVQQSLGYCPQFDALFDELTAREHL 2149 

Qy 2150 QLYTRLRGISWKDETVRWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIF 2209 

I I I I I I I M I I I II : 1 I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I II I I I I I I II I I 
Db 2150 QLYTRLRGIPWKDEAQWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIF 2209 

Qy 2210 LDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGS 2269 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2210 LDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGS 2269 

Qy 2270 IQHLKNRFGDGYMITVRTKSSQSVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLA 2329

I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2270 IQHLKNRFGDGYMITVRTKSSQNWDVVRFFNR^ 2329 

Qy 2330 QVFSKMEQVSGVLGIEDYSVSQTTLDNVFWFAKKQSDNLEQQETEPPSALQSPLGCLLS 2389 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I : I I I I I III 
Db 2330 QVFSKMEQWGVLGIEDYSVSQTTLDNVFVNFAKKQSDNVEQQEAE-PSSLPSPLG-LLS 2387 

Qy 2390 LLRPRSAPTELRALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2436 

Mill I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2388 LLRPRPAPTELRALVADEPEDLDTEDEGLISFEEERAQLSFNTDTLC 2434 



RESULT 3 
ABC1_HUMAN

ID ABC1_HUMAN STANDARD; PRT; 2261 AA.

AC O95477; Q96S56; Q96T85; Q9NQV4; Q9UN06; Q9UN07; Q9UN08; Q9UN09;

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE ATP-binding cassette, sub-family A, member 1 (ATP-binding cassette 

DE transporter 1) (ATP-binding cassette 1) (ABC-1) (Cholesterol efflux 

DE regulatory protein).

GN ABCA1 OR ABC1 OR CERP.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20345099; PubMed=10884428;

RA Santamarina-Fojo S., Peterson K.M., Knapper C.L., Qiu Y.,

RA Freeman L.A., Cheng J.-F., Osorio J., Remaley A.T., Yang X.-P.,

RA Haudenschild C.C., Prades C., Chimini G., Blackmon E.E.,

RA Francois T.L., Duverger N., Rubin E.M., Rosier M., Denefle P.,

RA Fredrickson D.S., Brewer H.B. Jr.;

RT "Complete genomic sequence of the human ABCA1 gene: analysis of the

RT human and mouse ATP-binding cassette A promoter.";

RL Proc. Natl. Acad. Sci. U.S.A. 97:7987-7992(2000).

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Skin; 

RA Schwartz K., Lawn R.M., Wade D.P.; 

RT "ABCA1 gene expression and apoA-I-mediated cholesterol efflux are 

RT regulated by LXR.";

RL Submitted (JUL-2000) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21251004; PubMed=11352567;

RA Qiu Y., Cavelier L., Chiu S., Yang X., Rubin E., Cheng J.-F.;

RT "Human and mouse ABCA1 comparative sequencing and transgenesis 

RT studies revealing novel regulatory sequences."; 

RL Genomics 73:66-76(2001). 

RN [4] 

RP SEQUENCE FROM N.A. 

RA Tanaka A.R., Abe-Dohmae S., Arakawa R., Sadanami K., Kidera A.,

RA Kioka N., Amachi T., Yokoyama S., Ueda K.;

RT "A new topological model of functional human ABCA1-signal peptide

RT cleavage and glycosylation of a large extracellular domain."; 

RL Submitted (FEB-2001) to the EMBL/GenBank/DDBJ databases. 

RN [5] 

RP SEQUENCE OF 21-2261 FROM N.A. 

RX MEDLINE=99194549; PubMed=10092505;

RA Langmann T., Klucken J., Reil M., Liebisch G., Luciani M.F.,

RA Chimini G., Kaminski W.E., Schmitz G.;

RT "Molecular cloning of the human ATP-binding cassette transporter 1

RT (hABC1): evidence for sterol-dependent regulation in macrophages.";

RL Biochem. Biophys. Res. Commun. 257:29-33(1999).

RN [6]

RP SEQUENCE OF 21-2261 FROM N.A.

RX MEDLINE=99364413; PubMed=10431238;

RA Rust S., Rosier M., Funke H., Real J., Amoura Z., Piette J.-C.,

RA Deleuze J.-F., Brewer H.B., Duverger N., Denefle P., Assmann G.;

RT "Tangier disease is caused by mutations in the gene encoding 

RT ATP-binding cassette transporter 1."; 

RL Nat. Genet. 22:352-355(1999). 

RN [7] 

RP PHOSPHORYLATION OF SER-1042 AND SER-2054.

RX MEDLINE=22289331; PubMed=12196520;

RA See R.H., Caday-Malcolm R.A., Singaraja R.R., Zhou S., Silverston A.,

RA Huber M.T., Moran J., James E.R., Janoo R., Savill J.M., Rigot V.,

RA Zhang L.H., Wang M., Chimini G., Wellington C.L., Tafuri S.R.,

RA Hayden M.R.;

RT "Protein kinase A site-specific phosphorylation regulates ATP-binding

RT cassette A1 (ABCA1)-mediated phospholipid efflux.";

RL J. Biol. Chem. 277:41835-41842(2002). 

RN [8] 

RP VARIANTS HDLD2 THR-1091 AND 1893-GLU-ASP-1894 DEL. 

RX MEDLINE=20001430; PubMed=10533863;

RA Marcil M., Brooks-Wilson A., Clee S.M., Roomp K., Zhang L.-H., Yu L.,

RA Collins J.A., van Dam M., Molhuizen H.O.F., Loubser O.,

RA Ouelette B.F.F., Sensen C.W., Fichter K., Mott S., Denis M.,

RA Boucher B., Pimstone S., Genest J. Jr., Kastelein J.J.P., Hayden M.R.;

RT "Mutations in the ABC1 gene in familial HDL deficiency with defective

RT cholesterol efflux.";

RL Lancet 354:1341-1346(1999).

RN [9] 

RP VARIANTS HDLD1 ARG-597 AND ARG-1477, AND VARIANT HDLD2 LEU-693 DEL. 

RX MEDLINE=99364411; PubMed=10431236; 

RA Brooks-Wilson A., Marcil M., Clee S.M., Zhang L.-H., Roomp K.,

RA van Dam M., Yu L., Brewer C., Collins J.A., Molhuizen H.O.F.,

RA Loubser O., Ouelette B.F.F., Fichter K., Ashbourne-Excoffon K.J.D.,

RA Sensen C.W., Scherer S., Mott S., Denis M., Martindale D.,

RA Frohlich J., Morgan K., Koop B., Pimstone S., Kastelein J.J.P.,

RA Hayden M.R.;

RT "Mutations in ABC1 in Tangier disease and familial high-density

RT lipoprotein deficiency.";

RL Nat. Genet. 22:336-345(1999).

RN [10] 

RP VARIANTS HDLD1 SER-590; SER-935 AND VAL-937, AND VARIANTS ALA-399 AND

RP MET-883.

RX MEDLINE=99364412; PubMed=10431237;

RA Bodzioch M., Orso E., Klucken J., Langmann T., Boettcher A.,

RA Diederich W., Drobnik W., Barlage S., Buechler C.,

RA Porsch-Oezcueruemez M., Kaminski W.E., Hahmann H.W., Oette K.,

RA Rothe G., Aslanidis C., Lackner K.J., Schmitz G.;

RT "The gene encoding ATP-binding cassette transporter 1 is mutated in 

RT Tangier disease."; 

RL Nat. Genet. 22:347-351(1999). 

RN [11] 

RP VARIANTS HDLD1 ILE-929; ARG-597 AND ARG-1477, AND VARIANTS HDLD2 

RP LEU-693 DEL; THR-1091; 1893-GLU-ASP-1894 DEL AND LEU-2150. 

RX MEDLINE=20540002; PubMed=11086027;

RA Clee S.M., Kastelein J.J.P., van Dam M., Marcil M., Roomp K.,

RA Zwarts K.Y., Collins J.A., Roelants R., Tamasawa N., Stulc T.,

RA Suda T., Ceska R., Boucher B., Rondeau C., DeSouich C.,

RA Brooks-Wilson A., Molhuizen H.O.F., Frohlich J., Genest J. Jr.,

RA Hayden M.R.;

RT "Age and residual cholesterol efflux affect HDL cholesterol levels and

RT coronary artery disease in ABCA1 heterozygotes.";

RL J. Clin. Invest. 106:1263-1270(2000). 

RN [12] 

RP VARIANTS HDLD1 ASN-1289 AND HIS-1800. 

RX MEDLINE=20171564; PubMed=10706591;

RA Brousseau M.E., Schaefer E.J., Dupuis J., Eustace B.,

RA Van Eerdewegh P., Goldkamp A.L., Thurston L.M., FitzGerald M.G.,

RA Yasek-McKenna D., O'Neill G., Eberhart G.P., Weiffenbach B.,

RA Ordovas J.M., Freeman M.W., Brown R.H. Jr., Gu J.Z.; 

RT "Novel mutations in the gene encoding ATP-binding cassette 1 in four 

RT tangier disease kindreds."; 

RL J. Lipid Res. 41:433-441(2000). 

RN [13] 

RP VARIANT HDLD1 ASP-1046, VARIANT HDLD2 CYS-230, AND VARIANTS LYS-219; 

RP ILE-825; MET-883 AND LYS-1587. 

RX MEDLINE=20396633; PubMed=10938021;

RA Wang J., Burnett J.R., Near S., Young K., Zinman B., Hanley A.J.G.,

RA Connelly P.W., Harris S.B., Hegele R.A.;

RT "Common and rare ABCA1 variants affecting plasma HDL cholesterol.";

RL Arterioscler. Thromb. Vasc. Biol. 20:1983-1989(2000).

RN [14]

RP VARIANT HDLD1 TRP-587, AND VARIANT LEU-2168.

RX MEDLINE=21157002; PubMed=11257260;



RA Bertolini S., Pisciotta L., Seri M., Cusano R., Cantafora A.,

RA Calabresi L., Franceschini G., Ravazzolo R., Calandra S.;

RT "A point mutation in ABC1 gene in a patient with severe premature

RT coronary heart disease and mild clinical phenotype of Tangier 

RT disease."; 

RL Atherosclerosis 154:599-605(2001). 

RN [15] 

RP VARIANTS LYS-219; MET-883 AND ASP-1172. 

RX MEDLINE=21157003; PubMed=11257261;

RA Brousseau M.E., Bodzioch M., Schaefer E.J., Goldkamp A.L., Kielar D.,

RA Probst M., Ordovas J.M., Aslanidis C., Lackner K.J.,

RA Bloomfield Rubins H., Collins D., Robins S.J., Wilson P.W.F.,

RA Schmitz G.;

RT "Common variants in the gene encoding ATP-binding cassette transporter 

RT 1 in men with low HDL cholesterol levels and coronary heart disease."; 

RL Atherosclerosis 154:607-611(2001). 

RN [16] 

RP VARIANT HDLD1 LEU-1506. 

RX MEDLINE=21369429; PubMed=11476961;

RA Lapicka-Bodzioch K., Bodzioch M., Kruell M., Kielar D., Probst M.,

RA Kiec B., Andrikovics H., Boettcher A., Hubacek J., Aslanidis C.,

RA Suttorp N., Schmitz G.;

RT "Homogeneous assay based on 52 primer sets to scan for mutations of

RT the ABCA1 gene and its application in genetic analysis of a new

RT patient with familial high-density lipoprotein deficiency syndrome.";

RL Biochim. Biophys. Acta 1537:42-48(2001).

RN [17]

RP VARIANTS HDLD1 ASN-1289 AND TRP-2081, AND VARIANT LYS-219.

RX MEDLINE=21369433; PubMed=11476965;

RA Huang W., Moriyama K., Koga T., Hua H., Ageta M., Kawabata S.,

RA Mawatari K., Imamura T., Eto T., Kawamura M., Teramoto T., Sasaki J.;

RT "Novel mutations in ABCA1 gene in Japanese patients with Tangier 

RT disease and familial high density lipoprotein deficiency with 

RT coronary heart disease."; 

RL Biochim. Biophys. Acta 1537:71-78(2001). 

RN [18] 

RP VARIANTS LYS-219; ALA-399; MET-771; PRO-774; ASN-776; ILE-825; 

RP MET-883; ASP-1172; LYS-1587 AND CYS-1731. 

RX MEDLINE=21138379; PubMed=11238261;

RA Clee S.M., Zwinderman A.H., Engert J.C., Zwarts K.Y.,

RA Molhuizen H.O.F., Roomp K., Jukema J.W., van Wijland M., van Dam M.,

RA Hudson T.J., Brooks-Wilson A., Genest J. Jr., Kastelein J.J.P.,

RA Hayden M.R.;

RT "Common genetic variation in ABCA1 is associated with altered 

RT lipoprotein levels and a modified risk for coronary artery disease."; 

RL Circulation 103:1198-1205(2001). 

RN [19] 

RP VARIANT HDLD1 THR-255, AND VARIANT ATHEROSCLEROSIS ASP-1611.

RX MEDLINE=21645894; PubMed=11785958;

RA Nishida Y., Hirano K., Tsukamoto K., Nagano M., Ikegami C., Roomp K.,

RA Ishihara M., Sakane N., Zhang Z., Tsujii K., Matsuyama A., Ohama T.,

RA Matsuura F., Ishigami M., Sakai N., Hiraoka H., Hattori H.,

RA Wellington C., Yoshida Y., Misugi S., Hayden M.R., Egashira T.,

RA Yamashita S., Matsuzawa Y.;

RT "Expression and functional analyses of novel mutations of ATP-binding 

RT cassette transporter-1 in Japanese patients with high-density 

RT lipoprotein deficiency."; 



RL Biochem. Biophys. Res. Commun. 290:713-721(2002). 

RN [20] 

RP VARIANTS LYS-219; MET-771; ILE-825; MET-883; ASP-1172; PHE-1181 AND 

RP LYS-1587. 

RX MEDLINE=22932833; PubMed=12966036;

RA Morabia A., Cayanis E., Costanza M.C., Ross B.M., Flaherty M.S.,

Query Match 33.4%; Score 4233.5; DB 1; Length 2261; 

Best Local Similarity 39.8%; Pred. No. 4.6e-249; 

Matches 999; Conservative 345; Mismatches 731; Indels 435; Gaps 61; 
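
The summary figures above appear to be related by simple arithmetic. The sketch below assumes (this is inferred from the numbers, not stated in the report) that "Best Local Similarity" is the number of identical matches divided by the total number of alignment columns, where the columns are the sum of matches, conservative substitutions, mismatches and indels. The class and field names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass
class AlignmentCounts:
    """Column counts as printed on the 'Matches; Conservative; ...' line."""
    matches: int
    conservative: int
    mismatches: int
    indels: int

    @property
    def columns(self) -> int:
        # Assumption: every alignment column falls in exactly one category.
        return self.matches + self.conservative + self.mismatches + self.indels

    @property
    def best_local_similarity(self) -> float:
        # Assumption: similarity = identical matches / alignment columns, in percent.
        return 100.0 * self.matches / self.columns


# Counts taken from the RESULT 3 summary line.
result3 = AlignmentCounts(matches=999, conservative=345, mismatches=731, indels=435)
print(result3.columns, round(result3.best_local_similarity, 1))
```

Under that assumption the RESULT 3 counts give 2510 columns and a similarity that rounds to the reported 39.8%.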

Qy 6 QLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAAPLTSA 65 

I I : I I I I I I : I : I I I I : I I : I I I : : I I I I I : I I 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLFDPARP 115 

I I I : I : : I : : I I : I I : : I : 
Db 65 GTLPWVQGIICNANNPCFRYPTPGEAPGWGNFNKSIVARLFSDARRLL LYSQKDT 120 

Qy 116 SLGSELEALR--QHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSLP 173

I : : I I | : : | : : I I I I I I I I I I 

Db 121 SMKDMRKVLRTLQQI KKS S SNLKLQDFLVDNET FS G- FLYHNLSLP 165 

Qy 174 NSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEELLLAPA 233 

II : I I I : I : I I III : I I :: 
Db 166 KSTVDKMLRADV ILHKVFLQGYQLHLTS-LCNGS KSEEMI 204 

Qy 234 LLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELRNQLDV 293 

II | | : : : | : | : I I : : I : 

Db 205 QL GDQEVSELCGLPREKLAAAE RVLRSNMDI 235 

Qy 294 AK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLD AQKVLQDVDVLS 341 

I : : I : I I : I : I III I : I : : I I 

Db 236 LKPILRTLNSTSPFPSKELAEA--TKTLLHSLGTLAQELFSMRSWSDMRQEVMFLTNVNS 293

Qy 342 ALALLL PQGACTGRT P GP PAS GAGGAAN GTGAGAVMG PNAT AEEGAP S AAALAT P 396 

: : I : III : I : I I I I : II 

Db 294 S S SSTQI YQAVSRIVCGHPEGGGLKI KSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTP 353 



Qy 397 DTLQGQCSAFVQ — LWAGLQPI LCGNNRTI EPEALRRGNMS SLGFTS KEQRNLGLLV 451 

I : : I : : : I I : I : I I 
Db 354 YCNDLMKNLESSPLSRIIWKALKPLLVG 381 

Qy 452 HLMTSNPKI LYAPAGS EVDRVI LKANETFAFVGNVTHYAQVWLNI S AEI RS FLEQGRLQQ 511 

Mill : | : : | : | | : : | : | : | : | : | : 

Db 382 KILYTPDTPATRQVMAEWKTFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 434 

Qy 512 HLRWL , QQYVAELRLHPE— ALNLSLDELPPALRQDNFS 547 

: I I I II I III : I I: I : I ' 

Db 435 LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYTWREAFNETN-- 492

Qy 548 LPS GMAL LQQ L DT I DNAAC GW I Q FMS KVS VD IFKGFPDEES I VN YT LNQ AYQ DN VT VFAS 607 

I : I I : | | | : : : : | , : : | : : | : | 

Db 493 QAIRTIS RFMECVNLNKLEPIATEVWLINKSME— LLDERKFWAG 535 

Qy 608 VI FQTRKDGS — LPPHVHYKI RQNSS FTEKTNEI RRAYWRPGPNTG GRFYFLYGFVW 662 

: : I II I I I I I I I I : I : I I : I : I I I I I I I I : 



Db 536 IVFTGITPGSIELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPRADPFEDMRYVWGGFAY 595



Qy 663 IQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAM 722

s I I :: I : I I I I : : I I : I I I I I I I I I I : I I I I : : I : I I I I : 

Db 596 LQDWEQAIIRVLTGTE-KKTGVYMQQMPYPCWDDIFLRVMSRSMPLFMTLAWIYSVAV 654 

Qy 723 TIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSISVTALTAILKYGQVLMHSH 782

I : I I I I I I I I I I : I I I : I : : I : I I I : : I : I I I I I I : I : I 
Db 655 IIKGIVYEKEARLKETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWILKLGNLLPYSD 714 

Qy 783 WT I WL FLAVYAVAT I M FC FL VS VL Y S KAKLAS AC GG 1 1 Y FL S YVP YMYVAI RE E VAH D K 842 

: I I : I : I I I I : I I I : I I : I : I I I : I I I I I I I I I : I I : I I 

Db 715 PS WFVFLSVFAVOTI LQCFLI STLFSRANLAAACGGI I YFTLYLPYVLC VAWQD 769 

Qy 843 I TAFE- KC IAS LMSTTAFGLGS K YFALYEVAGVGI QWHT FSQS PVEGDDFNLLLAVTMLM 901 

I I I I I : I I I I I : I I I I : I I : I : I I : I I I I I I I I : I : I : : 
Db 770 YVGFTLKI FASLLS PVAFGFGCEYFALFEEQGI GVQWDNLFES PVEEDGFNLTTSVSMML 829 

Qy 902 VDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWEWSWPWARTPRLSVM 961 

I : I I : : M M I I I I I M : II I I I I I Mill I: I I I : |:| 
Db 830 FDTFLYGVMTWYIEAVFPGQYGIPRPWYFPCTKSYWFGE ESDEKSHPGSNQKRIS — 884 

Qy 962 EEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLALNKLSLNLYENQW 1021

: I I I I I I I I I I I : I I I I : I I : I : : I : I I I I ' I : 

Db 885 EIC MEEEPTHLKLGVSIQNLVKVYRDGMKVAVDGLALNFYEGQIT 929 

Qy 1022 SFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLT 1081 

I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I : I I I I : I I I : I I I I I I I I I | | 
Db 930 S FLGHNGAGKTTTMS I LT GL FP PT S GTAY I LGKD I RS EMS T I RQN LGVC PQHNVL FDMLT 989 

Qy 1082 VEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSGGMKRKLSVAIAFVG 1140 

I I I I : I I I : I I I :::::: I I :: I I : I hi I I I I I I : I I I I I I : I I I I 

Db 990 VEEHIWFYARLKGLSEKHVKAEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLSVALAFVG 1049

Qy 1141 GSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKL 1200 

lh : I I I I I I I I I I I I : I I I I : I : I I I : I I I I : I I I I I I I I I I : I I I I I I I I I I I I I 
Db 1050 GSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKL 1109 

Qy 1201 KCCGS PLFLKGTYGDGYRLTLVKRPAEPG GPQEPGLAS 1238 

I I I I I I I 11111111:1 : | | | 

Db 1110 CCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGS 1169 

Qy 1239 SPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLER 1298 

I : I Mill: III I I : I : I I I I I I : I I I II : : 
Db 1170 DHES DTLT I DVS - - AI SNLI RKHVS EARLVEDI GHELT YVLP YEAAKEGAFVELFHEI DD 1227 

Qy 1299 SLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAG 1358 

I I : I I : I : : I I I I I : I I I I : I I I I I : II 
Db 1228 RLSDLGISSYGISETTLEEIFLKVAEE SGVDA-ETSDGTLP 1267 

Qy 1359 NLARCSELTQSQASLQSAS SVGSARGDEGAGYTDVYGDYRPLF-DNPQDPD — NVSLQEV 1415 

II: I : I III: II: :: : 

Db 1268 ARRNRRA-FGDKQSCLRPFTEDDAADPNDSDIDPESR 1303 

Qy 1416 EAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMT 1474

I : I I : I : I I : : I I : I I I I I I I I I : I I : I I : I I I I I I : I : 
Db 1304 ETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFAQIVLPAVFVCIALV 1363 



Qy 1475 VALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTF 1533 

: I I I I I I I I I : I I : : : I I :|:: 

Db 1364 FSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE DTGTLELLNAL 1408

Qy 1534 RLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPP 1593 

III: : I I : 
Db 1409 TKDPGFGTRCM EGNPI 1424 

Qy 1594 PSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVR C 1641 

II | | | | | | : | | : | : : : : | 

Db 1425 PD T P CQAG E E EWT TAP - VP QT I MD L FQN GNWTMQN P S P AC 1463

Qy 1642 TCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFTSDRF 1690 

II: II II III: I I I I I : I I I : I : I I : I : 

Db 1464 QCSSDKIKKMLPVCPPGAGGLPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIAKSLKN 1523 

Qy 1691 RLHRYGAI T FG — NVLKS I PAS FGT RAP PMVRK — 1721 

I I I : I I I : I : : I 

Db 1524 KIWVNEFRYGGFSLGVSNTQALPPSQEVNDATKQMKKHLKLAKDSSADRFLNSLGRFMTG 1583 

Qy 1722 I AVRRAAQVFYNNKGYHSMPT YLN SLNNAI LRANLP KS KGN PAAYGI TVTNH PMNKTS AS 1781 

: I : I : : LI | | : I : : : : I I : I I I I I I I I I I : I I : I I I I I I I : I I 
Db 1584 LDTRNNVKVWFNNKGWHAI S S FLNVINNAI LRANLQKGE-NPSHYGITAFNHPLNLTKQQ 1642 

Qy 1782 LS-LDYLLQGTDWIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLAN 1840

II: : I I ::: I : I I I I I I I I I I I I I I : I :: I I I I I I I : I I I : I I I I : I 

Db 1643 LS EVAPMTTS VDVLVS I CVI FAMS FVPAS FWFLI QERVS KAKHLQFI S GVKPVI YWLSN 1702 

Qy 1841 YVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEV 1900 

: I I II I I : I I II : I I I : I I I I I : I I I I I I I I I I : I I I I I I I : : 
Db 1703 FVWDMCNYWPATLVI 1 1 FI CFQQKS YVS STNLPVLALLLLLYGWSITPLMYPAS FVFKI 1762 

Qy 1901 PSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIFPNYNLGHGLME 1960

I I : I I I I : I I I I I I : I I I I : I : I I I I : I I I I I I I I I : : I I I I : : 
Db 1763 PSTAYWLTSVNLFIGINGSVATFVLELFT-DNKLNNINDILKSVFLIFPHFCLGRGLID 1821 

Qy 1961 MAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQR 2020 

I I : : : : I : : : II II : I I I I I I I I I II I I : I : : I I I I I : 
Db 1822 MVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVLIQYRFFIRPRP 1880 

Qy 2021 MPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVR 2079 

: - I : I I I I 1111:11 I I : : : I : I I I : I : : I I I I I : I : I : 
Db 1881 VNAKLS PLNDEDEDVRRERQRI LDGGGQNDI LEI KELTKI YRRK RKPAVDRICVGIP 1937 

Qy 2080 PGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDAL 2139 

I I II I I I I I I I I I I I : I I I I I I I I I : I |:||:| :|:| : :| |::||lll l|: 
Db 1938 PGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCPQFDAI 1997 

Qy 2140 FDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAI 2199 

: II Ml:: : | | | : | : : I : I I : I I I I I : I I I I I I I I If I I I I I : 
Db 1998 TELLTGREHVEFFALLRGVPEKEVGKVGEWAI RKLGLVKYGEKYAGNYSGGNKRKLSTAM 2057 

Qy 2200 ALI GYPAFI FLDEPTTGMDPKARRFLWNLI LDLI KTGRSWLTSHSMEECEALCTRLAIM 2259 

I I I I I : I I I I I I I I I I I I I I I I I I I I ::| I I I II I I I I I I I I I I I I I I I : I I I 
Db 2058 ALIGGPPWFLDEPTTGMDPKARRFLWNCALSWKEGRSWLTSHSMEECEALCTRMAIM 2117 



Qy 2260 VNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEAMLKERHHTKVQ 2318

I I I I I I II I : I I I II I I I I I I I I I I : : I I | | I | : : I I : I : I

Db 2118 VNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDFFGLAFPGSVPKEKHRNMLQ 2177

Qy 2319 YQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDN 2368

Ml I I I I : : I I : I I I I I I I I II I I II I I I I I I I III:

Db 2178 YQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 2227

RESULT 4
ABC1_MOUSE

ID ABC1_MOUSE STANDARD; PRT; 2261 AA.

AC P41233;

DT 01-FEB-1995 (Rel. 31, Created)

DT 16-OCT-2001 (Rel. 40, Last sequence update)

DT 15-MAR-2004 (Rel. 43, Last annotation update)

DE ATP-binding cassette, sub-family A, member 1 (ATP-binding cassette

DE transporter 1) (ATP-binding cassette 1) (ABC-1).

GN ABCA1 OR ABC1.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=DBA/2; TISSUE=Macrophage;

RX MEDLINE=94375008; PubMed=8088782;

RA Luciani M.F., Denizot F., Savary S., Mattei M.-G., Chimini G.;

RT "Cloning of two novel ABC transporters mapping on human chromosome

RT 9.";

RL Genomics 21:150-159(1994).

RN [2]

RP SEQUENCE FROM N.A.

RC STRAIN=C57BL/6J;

RX MEDLINE=21251004; PubMed=11352567;

RA Qiu Y., Cavelier L., Chiu S., Yang X., Rubin E., Cheng J.-F.;

RT "Human and mouse ABCA1 comparative sequencing and transgenesis

RT studies revealing novel regulatory sequences.";

RL Genomics 73:66-76(2001).

CC -!- FUNCTION: cAMP-dependent and sulfonylurea-sensitive anion

CC transporter. Key gatekeeper influencing intracellular cholesterol

CC transport (By similarity).

CC -!- TISSUE SPECIFICITY: Widely expressed in adult tissues. Highest

CC levels are found in pregnant uterus and uterus.

CC -!- DOMAIN: Multifunctional polypeptide with two homologous halves,

CC each containing an hydrophobic membrane-anchoring domain and an

CC ATP binding cassette (ABC) domain.

CC -!- PTM: Phosphorylation on Ser-2054 regulates phospholipid efflux (By

CC similarity).

CC -!- SIMILARITY: Belongs to the ABC transporter family. ABC A subfamily.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; X75926; CAA53530.1; ALT_INIT.

DR EMBL; AF287263; AAG39073.1; ALT_INIT.

DR MGD; MGI:99607; Abca1.

DR GO; GO:0008203; P:cholesterol metabolism; IDA.

DR GO; GO:0030301; P:cholesterol transport; IDA.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

KW ATP-binding; Glycoprotein; Transmembrane; Transport; Phosphorylation.



FT TRANSMEM     26     42   POTENTIAL.
FT TRANSMEM    640    656   POTENTIAL.
FT TRANSMEM    690    706   POTENTIAL.
FT TRANSMEM    717    733   POTENTIAL.
FT TRANSMEM    749    765   POTENTIAL.
FT TRANSMEM    771    787   POTENTIAL.
FT TRANSMEM   1041   1057   POTENTIAL.
FT TRANSMEM   1351   1367   POTENTIAL.
FT TRANSMEM   1661   1677   POTENTIAL.
FT TRANSMEM   1708   1724   POTENTIAL.
FT TRANSMEM   1737   1753   POTENTIAL.
FT TRANSMEM   1775   1791   POTENTIAL.
FT TRANSMEM   1854   1870   POTENTIAL.
FT NP_BIND     933    940   ATP (POTENTIAL).
FT NP_BIND    1946   1953   ATP (POTENTIAL).
FT MOD_RES    1042   1042   PHOSPHORYLATION (BY PKA) (BY SIMILARITY).
FT MOD_RES    2054   2054   PHOSPHORYLATION (BY PKA) (BY SIMILARITY).
FT CARBOHYD     14     14   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD     98     98   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    151    151   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    161    161   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    196    196   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    244    244   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    292    292   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    337    337   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    349    349   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    400    400   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    478    478   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    489    489   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    521    521   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    820    820   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   1144   1144   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   1294   1294   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   1453   1453   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   1499   1499   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   1504   1504   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   1637   1637   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   2044   2044   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   2238   2238   N-LINKED (GLCNAC...) (POTENTIAL).
FT CONFLICT   1567   1568   MISSING (IN REF. 2).
FT CONFLICT   2024   2024   MISSING (IN REF. 2).

SQ SEQUENCE 2261 AA; 254011 MW; FAE62B21FD1D09F9 CRC64;



Query Match 33.1%; Score 4195.5; DB 1; Length 2261; 

Best Local Similarity 39.3%; Pred. No. 9.5e-247; 

Matches 990; Conservative 350; Mismatches 724; Indels 457; Gaps 62; 

Qy 6 QLQLLLWKNVTLKRRS PWVLAFEI FI PLVLFFI LLGLRQKKPTI SVKEVP FYTAAPLTSA 65 

I I : I I I I I I : I : I I I I : I I : I I I : : I I I I I : I I 

Db 6 QLRLLLWKNLTFRRRQTCQLLLEVAWPLFIFLILISVRLSYPPYEQHECHFPNKA-MPSA 64 

Qy 66 GILPVMQSLCPDGQRDEFGFL QYANSTVTQLLERLDRWEEGNLF DP 112 

I I I : I : : I : : I I : : I I : : I : | 

Db 65 GTLPWVQGIICNANNPCFRYPTPGEAPGWGNFNKSIVSRLFSDAQRLL LYSQRDT 120 



Qy 113 ARPSLGSELEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQNLSL 172 

: : I I I I : I : I : : I I I I I I 

Db 121 SIKDMHKVLRMLRQ IKHPNSNLKLQDFLVDNETFSGFLQHNLSL 164 

Qy 173 PNSTAQALLAARVDPPEVY HLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRME 226

I I I : I I I : I : II : I I : : | 

Db 165 PRSTVDSLLQXNVGLQKVFLQGYQLHL ASLCNGS KLE 201 

Qy 227 ELLLAPALLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAE 286

I:: II I |: : :| : || | 
Db 202 EII QL GDAEVSALCGLPRKKL DAAERV 228 

Qy 287 LRNQLDVAK-VSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLDA 330

I I : I : I I : I : I | : | III: 

Db 229 LRYNMDI LKPWTKL NSTSHLPTQHLAEATTVLLDSLGGLAQELFSTKSWS 279 

Qy 331 QKVLQDVDVLSALALLLPQGACTGRTPGPPASGAGGAAN GT GAGAVMG PNAT 382 

1 = 1: : I I : : I : III : I : I I I 

Db 280 DMRQEVMFLTMVNSSSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNNT 339 

Qy 383 AEEGAPSAAALATP DTLQGQCSAFVQ — LWAGLQPILCGNNRTIEPEALRRGNMSSL 437 

I : II I :: I : : : I I : I : I I 
Db 340 EEDVDTFYDNSTTPYCNDLMKNLESSPLSRIIWKALKPLLVG 381 

Qy 438 GFTSKEQRNLGLLVHLMTSNPKILYAPAGSEVDRVILKANETFAFVGNVTHYAQVWLNIS 497 

I I I I I : | : : | : | | : : | : | 

Db 382 KILYTPDTPATRQVMAEVNKTFQELAVFHDLEGMWEELS 420 

Qy 498 AEIRSFLEQGRLQQHLRWL QQYVAELRLHPEAL- — NLS 533 

: I : I : I : : I I I : I I . : I I : I I 

Db 421 PQIWTFMENSQEMDLVRTLLDSRGNDQFWEQKLDGLDWTAQDIMAFLAKNPEDVQSPNGS 480 

Qy 534 LDELPPALRQDNFSLPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYT 593 

: 1:1 I : I I : I I I : : : : | | : : | : 

Db 481 VYTWREAFNETN QAIQTIS RFMECVNLNKLEPI PTEVRLINKS 523 

Qy 594 LNQAYQDNVTVFASVIFQ--TRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTG 651

: I : I :: I I I I I I I I I I : I : I I : I : I I I I I 

Db 524 ME--LLDERKFWAGIVFTGITPDSVELPHHVKYKIRMDIDNVERTNKIKDGYWDPGPRAD 581

Qy 652 GRFYFLYGFVWIQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVIEHMM 708 

> II ::||::|:|M | : : I III Mill INI: I 

Db 582 PFEDMRYWGGFAYLQDVVEQAIIRVLTGSE-KKTGVYVQQMPYPCYVDDIFLRVMSRSM 640 



709 PLCMVISWVYSVAMTIQHIVAEKEHRLKEVMKTMGLNNAVHWAWFITGFVQLSISVTAL 768 

I I I : : I : M I I : I : I I I I I I I I I I : | | | : | : | : | | : : : I : I I 
641 PLFMTLAWIYSVAVIIKSIVYEKEARLKETMRIMGLDNGILWFSWFVSSLIPLLVSAGLL 700 

769 TAI LK YGQVLMH S HWI I WL FLAVYAVAT IMFC FLVS VL Y S KAKLAS ACGG I 1 Y FL S YVP 828 

I I I I : I : I : : : : I I : I : I : I I : I I I : I I : I : I I I : I I I I II I I I : I 
701 VVILKLGNLLPYSDPSVVFVFLSVFA1WTILQCFLISTLFSRANLAAACGGIIYFTLYLP 760 

829 YMYVAIREEVAHDKITAFE-KCIASLMSTTAFGLGSKYFALYEVAGVGIQWHTFSQSPVE 887 

I : I I I I 111:1 I I I I : I I I I : I I : I : I I : I I I I 

761 YVLC VAWQ D YVG FS I K I FAS L L S P VAFG FGC E Y FAX FE EQGI GVQWDN L FE S P VE 815 

888 GDDFNLLLAVTMLMVDAVVYGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGSGRTEAWE 947

I Ml I 1:1:: I : I I :: I I I I I I I I I I I : I I | | | I I ||||. | | | 

816 EDGFNLTTAVSMMLFDTFLYGVMTWYIEAVFPGQYGIPRPWYFPCTKSYWFGE EIDE 872 

948 WSWPWARTPRLSVMEEDQACAMESRRFEETRGMEEEPTHLPLWCVDKLTKVYKDDKKLA 1007 

II: : I : I I I I I I I I I I I : I I I I : I I : I 

873 KSHPGSSQKGVS EIC MEEEPTHLRLGVSIQNLVKVYRDGMKVA 915 

1008 LNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKNL 1067 
: : I : I I II I : I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I : I I I I : I I 
916 VDGLALNFYEGQITSFLGHNGAGKTTTMSILTGLFPPTSGTAYILGKDIRSEMSSIRQNL 975 

1068 GMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLEL-SNKRHSLVQTLSG 1126
IMIIIIIIII I I I I I I : I I I : I I I :::: :: ||::| I : I :| I III 
976 GVCPQHNVLFDMLTVEEHIWFYARLKGLS EKHVKAEMEQMALDVGLP P S KLKS KT SQLS G 1035 

1127 GMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEAD 1186 

I I : I I I I I I : I M I I I : : I I I I I I I I I I I I : I I I I : I : I I I : I I I I : I I I I I I I I I I 
1036 GMQRKLSVALAFVGGSKWILDEPTAGVDPYSRRGIWELLLKYRQGRTIILSTHHMDEAD 1095 

1187 LLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEPG 1229 

: I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I 
1096 ILGDRIAIISHGKLCCVGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNSSSTVSCLKKE 1155 

1230 GPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAA 1284 

: I I I I : I Mill: III I I : I : I I I I I 

1156 DSVSQSSSDAGLGSDHESDTLTIDVS — AISNLIRKHVSEARLVEDIGHELTYVLPYEAA 1213 

1285 KKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVL 1344 
I : I I I M : : I I : I I : I : : I I I I I : I I I I : I I I I I : I 

1214 KEGAFVELFHEIDDRLSDLGISSYGISETTLEEIFLKVAEE SGVDA-ETSDGTL 1266 

1345 PGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLF 1401 

I I I : I : | | : 

1267 P ARRNRRA FGDKQSCLHPF 1285 

1402 — DNPQDPD — NVSLQEVEAEALSRV-GQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSK 1456 
I : II: : : : I : I I : I : I I : I I I Mi ll II III: I 

1286 TEDDAVDPNDSDIDPESRETDLLSGMDGKGSYQLKGWKLTQQQFVALLWKRLLIARRSRK 1345 

1457 ALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYH-NYTQPRGNFIPYANEERRE 1515 

I : M : I I I I I I : I : : I II I I I I I I : II : : : I 

1346 GFFAQIVLPAVFVCIALVFSLIVPPFGKYPSLELQPWMYNEQYT FVSNDAPE 1397

1516 YRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSM 1575 
I I : I : : I I I : : I I :

1398 -DMGTQELLNALTKDPGFGTRCMEGNPIP- -DTP 1428



1576 CLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEM WTSAPS 1630 

M : : I I I : I I : : : II 
1429 CL AGEED WT ISPVPQSIVDL FQN GNWTMKN P 1459 

1631 LPRLVREPVRCTCSAQGTGFS CPSSVGG-HPPQMRWTGDILTDITGHNVSEYLLFT 1686 

I IN: II Mill: I I I I : : I I I : I : I I : I 

1460 SP ACQCSSDKIKKMLPVCPPGAGGLPPPQRKQKTADILQNLTGRNISDYLVKT 1512 

1687 SDRF RLHRYGAITFG NVLKS I PAS FGTRAP P 1717 

: I II : I : | : : | | 

1513 YVQIIAKSLKNKIWVNEFRYGGFSLGVSNSQALPPSHEVNDAI KQMKKLLKLTKDTSADR 1572 



1718 MVRKIA- VRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITV 1770

: : : : I : : I I I I : I : : : : I I : I I I I I I I I I I : I I : I I I I

1573 FLSSLGRFMAGLDTKNNVKVWFNNKGWHAISSFLNVINNAILRANLQKGE-NPSQYGITA 1631



1771 TNHPMNKTSASLS-LDYLLQGTDWIAIFIIVAMSFVPAS FWFLVAEKSTKAKHLQFVS 1829 

111:11 II: : I I : : : I : I I I I II I I I I I I I I : I : : I I I I I I I : I 

1632 FNHPLNLTKQQLSEVALMTTSVDVLVSICVIFAMSFVPASFWFLIQERVSKAKHLQFIS 1691 

1830 GCN P 1 1 YWLAN YVWDMLN YLVPATCCVI I L FVFDLP AYT S PTN FPAVLS LFLL YGWS I T P 1889 

I I : I I I I : I : I I I I I I : I I I I : I I I : I I II I : I MINIMI 
1692 GVKPVIYWLSNFVWDMCNYWPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITP 1751 

1890 IMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLIF 1949 

M M I I I | : : | | : | | | | : M I I I I : I I I I : I : I I : : I I : I Ml MM 
1752 LMYPAS FVFKIPSTAYVVLTSVNLFIGINGSVATFVLELFTNNK-LNDINDILKSVFLIF 1810 

1950 PNYNLGHGLMEMAYNEYINEYYAJCIGQFDKMKSPFEWDIWRGLVAMAVEGVVGFLLTIM 2009 

I : : I I I I : : I I : : : : I : : : II I I : I I I I I I I I I | | | | : | : : 
1811 PHFCLGRGLIDMVKNQAMADALERFGE-NRFVSPLSWDLVGRNLFAMAVEGWFFLITVL 1869 

2010 CQYNFLRRPQRMPVSTKPVED-DVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRI 2068 

M I IM: Mill! I I I I : I I I I : : : I : I I I : I : : I 
1870 IQYRFFIRPRPVKAKLPPLNDEDEDVRRERQRILDGGGQNDILEIKELTKIYRRK RK 1926

2069 LAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQ 2128 

II II : I : I : I I I I I I I I I M I I I I :: I I I I I I I I I MINI MM : : I I 
1927 PAVDRICIGIPPGECFGLLGVNGAGKSTTFKMLTGDTPVTRGDAFLNKNSILSNIHEVHQ 1986 

2129 SLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYS 2188 

: : I I I M I I : : I I I I I :: : III: I : : MM II I M M I II 
1987 NMGYCPQFDAITELLTGREHVEFFALLRGVPEKEVGKFGEWAIRKLGLVKYGEKYASNYS 2046

2189 GGNKRKLSTAIALI GYPAFI FLDEPTTGMDPKARRFLWNLILDLI KTGRSWLTSHSMEE 2248 

M I M M I I I : M I I I : II II II I I I I I I I :: I MM II I 

2047 GGNKRKLSTAMALIGGPPWFLDEPTTGMDPKARRFLWNCALSIVKEGRSWLTSHSMEE 2106 

2249 CEALCTRIAIMVNGRLRCLGSIQHLKNRFGDGYMITVR-TKSSQSVKDWRFFNRNFPEA 2307

I I I I I II M II II II I II II : I II II I I I II I I I I I : Ml II ||: 
2107 CEALCTRMAIMVNGRFRCLGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQEFFGLAFPGS 2166 



Qy 2308 MLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSD 2367

: I I I : I MINI I I I : : I I : | I II I I II I II I I I I I II I I I I M

Db 2167 VLKEKHRNMLQYQLPSSLSSLARIFSILSQSKKRLHIEDYSVSQTTLDQVFWFAKDQSD 2226



Qy 2368 N 2368 

Db 2227 D 2227 



RESULT 5 
ABCR_HUMAN 

ID ABCR_HUMAN STANDARD; PRT; 2273 AA. 

AC P78363; O15112; O60438; O60915;

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update)

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Retinal-specific ATP-binding cassette transporter (RIM ABC 

DE transporter) (RIM protein) (RMP) (Stargardt disease protein).

GN ABCA4 OR ABCR. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A., VARIANTS STGD, AND VARIANTS HIS-846 AND GLN-943. 

RX MEDLINE=97207641; PubMed=9054934;

RA Allikmets R., Singh N., Sun H., Shroyer N.F., Hutchinson A.,

RA Chidambaram A., Gerrard B., Baird L., Stauffer D., Peiffer A.,

RA Rattner A., Smallwood P.M., Li Y., Anderson K.L., Lewis R.A.,

RA Nathans J., Leppert M., Dean M., Lupski J.R.;

RT "A photoreceptor cell-specific ATP-binding transporter gene (ABCR) is 

RT mutated in recessive Stargardt macular dystrophy."; 

RL Nat. Genet. 15:236-246(1997). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=97345663; PubMed=9202155;

RA Azarian S.M., Travis G.H.; 

RT "The photoreceptor rim protein is an ABC transporter encoded by the 

RT gene for recessive Stargardt's disease (ABCR)."; 

RL FEBS Lett. 409:247-252(1997). 

RN [3] 

RP SEQUENCE FROM N.A., AND VARIANTS STGD TRP-18 AND CYS-212.

RX MEDLINE=98163759; PubMed=9503029; 

RA Gerber S., Rozet J.-M., van de Pol T.J.R., Hoyng C.B., Munnich A., 

RA Blankenagel A., Kaplan J., Cremers F.P.M.; 

RT "Complete exon-intron structure of the retina-specific ATP binding 

RT transporter gene (ABCR) allows the identification of novel mutations 

RT underlying Stargardt disease."; 

RL Genomics 48:139-142(1998).

RN [4] 

RP SEQUENCE FROM N.A., AND VARIANTS STGD. 

RX MEDLINE=98141123; PubMed=9490294;

RA Nasonkin I., Illing M., Koehler M.R., Schmid M., Molday R.S.,

RA Weber B.H.F.;

RT "Mapping of the rod photoreceptor ABC transporter (ABCR) to 1p21-p22.1

RT and identification of novel mutations in Stargardt's disease."; 

RL Hum. Genet. 102:21-26(1998). 

RN [5] 

RP CHARACTERIZATION. 



RX MEDLINE=99175213; PubMed=10075733;

RA Sun H., Molday R.S., Nathans J.; 

RT "Retinal stimulates ATP hydrolysis by purified and reconstituted ABCR, 

RT the photoreceptor-specific ATP-binding cassette transporter 

RT responsible for Stargardt disease."; 

RL J. Biol. Chem. 274:8269-8281(1999).

RN [6] 

RP DISEASE.

RX MEDLINE=98133912; PubMed=9466990;

RA Cremers F.P.M., van de Pol D.J.R., van Driel M.A., den Hollander A.I.,

RA van Haren F.J.J., Knoers N.V.A.M., Tijmes N., Bergen A.A.B.,

RA Rohrschneider K., Blankenagel A., Pinckers A.J.L.G., Deutman A.F.,

RA Hoyng C.B.;

RT "Autosomal recessive retinitis pigmentosa and cone-rod dystrophy

RT caused by splice site mutations in the Stargardt's disease gene

RT ABCR."; 

RL Hum. Mol. Genet. 7:355-362(1998). 

RN [7] 

RP VARIANTS ARMD2, AND VARIANTS. 

RX MEDLINE=97442530; PubMed=9295268;

RA Allikmets R., Shroyer N.F., Singh N., Seddon J.M., Lewis R.A.,

RA Bernstein P.S., Peiffer A., Zabriskie N.A., Li Y., Hutchinson A.,

RA Dean M., Lupski J.R., Leppert M.;

RT "Mutation of the Stargardt disease gene (ABCR) in age-related macular 

RT degeneration."; 

RL Science 277:1805-1807(1997). 

RN [8] 

RP VARIANTS STGD TRP-18; CYS-212; HIS-636; MET-1019; VAL-1038; CYS-1108; 

RP TRP-1640; SER-1977 AND HIS-2107, AND VARIANTS FFM PRO-11; PRO-541; 

RP VAL-1038; GLU-1091; CYS-1508; PHE-1970 AND ARG-1971. 

RX MEDLINE=98454319; PubMed=9781034;

RA Rozet J.-M., Gerber S., Souied E., Perrault I., Chatelin S., Ghazi I.,

RA Leowski C., Dufier J.-L., Munnich A., Kaplan J.;

RT "Spectrum of ABCR gene mutations in autosomal recessive macular

RT dystrophies.";

RL Eur. J. Hum. Genet. 6:291-295(1998). 

RN [9] 

RP VARIANTS STGD. 

RX MEDLINE=99138655; PubMed=9973280; 

RA Lewis R.A., Shroyer N.F., Singh N., Allikmets R., Hutchinson A.,

RA Li Y., Lupski J.R., Leppert M., Dean M.;

RT "Genotype/phenotype analysis of a photoreceptor-specific ATP-binding

RT cassette transporter gene, ABCR, in Stargardt disease."; 

RL Am. J. Hum. Genet. 64:422-434(1999). 

RN [10] 

RP VARIANTS STGD, AND VARIANTS. 

RX MEDLINE=99192348; PubMed=10090887;

RA Maugeri A., van Driel M.A., van de Pol D.J.R., Klevering B.J.,

RA van Haren F.J.J., Tijmes N., Bergen A.A.B., Rohrschneider K.,

RA Blankenagel A., Pinckers A.J.L.G., Dahl N., Brunner H.G., 

RA Deutman A.F., Hoyng C.B., Cremers F.P.M.; 

RT "The 2588G-->C mutation in the ABCR gene is a mild frequent founder

RT mutation in the western European population and allows the 

RT classification of ABCR Mutations in patients with Stargardt disease."; 

RL Am. J. Hum. Genet. 64:1024-1035(1999). 

RN [11] 

RP VARIANT STGD TYR-54, AND VARIANT ALA-863. 



RX MEDLINE=20077755; PubMed=10612508;

RA Zhang K., Garibaldi D.C., Kniazeva M., Albini T., Chiang M.F.,

RA Kerrigan M., Sunness J.S., Han M., Allikmets R.;

RT "A novel mutation in the ABCR gene in four patients with autosomal 

RT recessive Stargardt disease."; 

RL Am. J. Ophthalmol. 128:720-724(1999). 

RN [12] 

RP VARIANTS STGD VAL-60; ARG-206; ASN-300; PRO-541; ALA-849; PRO-974;

RP VAL-1038; CYS-1108; LEU-1408; ARG-1488; ASP-1652; PRO-1729; GLU-1961; 

RP TRP-2038; TRP-2077; HIS-2107; ARG-2128 AND TYR-2150. 

RX MEDLINE=99221420; PubMed=10206579;

RA Fishman G.A., Stone E.M., Grover S., Derlacki D.J., Haines H.L.,

RA Hockey R.R.;

RT "Variation of clinical expression in patients with Stargardt dystrophy 

RT and sequence variations in the ABCR gene."; 

RL Arch. Ophthalmol. 117:504-510(1999). 

RN [13] 

RP VARIANTS GLU-1961 AND ASN-2177. 

RX MEDLINE=20349288; PubMed=10880298;

RA Allikmets R., Tammur J., Hutchinson A., Lewis R.A., Shroyer N.F.,

RA Dalakishvili K., Lupski J.R., Steiner K., Pauleikhoff D., Holz F.G.,

RA Weber B.H.F., Dean M., Atkinson A., Gail M.H., Bernstein P.S.,

RA Singh N., Peiffer A., Zabriskie N.A., Leppert M., Seddon J.M.,

RA Zhang K., Sunness J.S., Udar N.S., Yelchits S., Silva-Garcia R.,

RA Small K.W., Simonelli F., Testa F., D'Urso M., Brancato R.,

RA Rinaldi E., Ingvast S., Taube A., Wadelius C., Souied E., Ducroq D.,

RA Kaplan J., Assink J.J.M., ten Brink J.B., de Jong P.T.V.M.,

RA Bergen A.A.B., Maugeri A., van Driel M.A., Hoyng C.B., Cremers F.P.M.,

RA Paloma E., Coco R., Balcells S., Gonzalez-Duarte R., Kermani S.,

RA Stanga P., Bhattacharya S.S., Bird A.C.;

RT "Further evidence for an association of ABCR alleles with age-related 

RT macular degeneration."; 

RL Am. J. Hum. Genet. 67:487-491(2000). 

RN [14] 

RP VARIANTS STGD GLU-60; THR-60; GLU-65; LEU-68; ARG-72; CYS-212; 

RP SER-230; SER-247; VAL-328; LYS-471; PRO-541; GLN-572; ARG-607; 

RP LYS-635; CYS-653; TYR-764; ARG-765; ALA-901; ILE-959; LYS-1036; 

RP VAL-1038; PRO-1063; ASP-1087; CYS-1097; CYS-1108; LEU-1380; LYS-1399; 

RP PRO-1430; VAL-1440; HIS-1443; LEU-1486; TYR-1488; MET-1537; PRO-1689; 

RP LEU-1705; THR-1733; ARG-1748; PRO-1763; LYS-1885; HIS-1898; GLU-1961; 

RP ARG-1975; SER-1977; GLY-2077; TRP-2077 AND VAL-2241, AND VARIANTS 

RP GLN-152; HIS-212; ARG-423; ILE-552; ARG-914; GLN-943; THR-1562;

RP ILE-1868; MET-1921; LEU-1948; PHE-1970; ALA-2059; ASN-2177 AND 

RP VAL-2216. 

RX MEDLINE=20442027; PubMed=10958763;

RA Rivera A., White K., Stoehr H., Steiner K., Hemmrich N., Grimm T.,

RA Jurklies B., Lorenz B., Scholl H.P.N., Apfelstedt-Sylla E.,

RA Weber B.H.F.;

RT "A comprehensive survey of sequence variation in the ABCA4 (ABCR) gene 

RT in Stargardt disease and age-related macular degeneration."; 

RL Am. J. Hum. Genet. 67:800-813(2000). 

RN [15] 

RP VARIANTS CORD3 GLU-65; CYS-212; PRO-541; ALA-863; GLY-863 DEL; 

RP VAL-1038; LYS-1122; TYR-1490 AND ASP-1598. 

RX MEDLINE=20442040; PubMed=10958761;

RA Maugeri A., Klevering B.J., Rohrschneider K., Blankenagel A.,

RA Brunner H.G., Deutman A.F., Hoyng C.B., Cremers F.P.M.; 



RT "Mutations in the ABCA4 (ABCR) gene are the major cause of autosomal 

RT recessive cone-rod dystrophy."; 

RL Am. J. Hum. Genet. 67:960-966(2000). 

RN [16] 

RP VARIANTS STGD ASP-340; GLN-572; ALA-863; SER-965; VAL-1038; ALA-1780 

RP AND HIS-1898, AND VARIANT GLN-943. 

RX MEDLINE=20208356; PubMed=10746567;

RA Shroyer N.F., Lewis R.A., Lupski J.R.;

RT "Complex inheritance of ABCR mutations in Stargardt disease: linkage

RT disequilibrium, complex alleles, and pseudodominance.";

RL Hum. Genet. 106:244-248(2000).

RN [17] 

RP VARIANTS STGD. 

RX MEDLINE=20098082; PubMed=10634594;

RA Papaioannou M., Ocaka L., Bessant D., Lois N., Bird A.C., Payne A.,

RA Bhattacharya S.S.; 

RT "An analysis of ABCR mutations in British patients with recessive 

RT retinal dystrophies."; 

RL Invest. Ophthalmol. Vis. Sci. 41:16-19(2000). 

RN [18] 

RP VARIANTS STGD CYS-212; ASP-767; ILE-897; VAL-1038; LYS-1087; LYS-1399; 

RP GLN-1640 AND GLU-1961, AND VARIANT HIS-212. 

RX MEDLINE=20174852; PubMed=10711710;

RA Simonelli F., Testa F., de Crecchio G., Rinaldi E., Hutchinson A.,

RA Atkinson A., Dean M., D'Urso M., Allikmets R.;

RT "New ABCR mutations and clinical phenotype in Italian patients with

RT Stargardt disease."; 

RL Invest. Ophthalmol. Vis. Sci. 41:892-897(2000). 

RN [19] 

RP CHARACTERIZATION OF VARIANTS, AND MUTAGENESIS OF GLY-966; LYS-969; 

RP GLY-1975 AND LYS-1978. 

RX MEDLINE=20472331; PubMed=11017087;

RA Sun H., Smallwood P.M., Nathans J.; 

RT "Biochemical defects in ABCR protein variants associated with human 

RT retinopathies."; 

RL Nat. Genet. 26:242-246(2000).

RN [20] 

RP VARIANT STGD ASN-972, AND VARIANTS GLN-943; ILE-1868 AND LEU-1948. 

Query Match 30.6%; Score 3875.5; DB 1; Length 2273;

Best Local Similarity 35.7%; Pred. No. 3e-227; 

Matches 910; Conservative 403; Mismatches 763; Indels 471; Gaps 58; 
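
The summary figures above can be cross-checked with a short calculation. The sketch below assumes that "Best Local Similarity" is the number of identical positions divided by the total number of aligned columns (matches + conservative + mismatches + indels). The report does not state this definition, so the function name and formula are illustrative assumptions, although the formula does reproduce the reported 35.7%.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percentage of identical positions over all aligned columns.

    Assumed definition: the report itself does not spell out the formula.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts copied from the summary line of this result.
pct = best_local_similarity(matches=910, conservative=403,
                            mismatches=763, indels=471)
print(f"{pct:.1f}%")  # prints 35.7%
```

The "Query Match" percentage cannot be checked the same way, because it depends on the query's self-alignment score, which this section does not report.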

Qy 1 MGFLHQLQLLLWKNVTLKRRSPWVLAFEIFIPLVLFFILLGLRQKKPTISVKEVPFYTAA 60

Ml: 1:1111111 ll::| |: II II :|: II I | | | | 

Db 1 MGFVRQIQLLLWKNWTLRKRQKIRFWELVWPLSLFLVLIWLRNANPLYSHHECHFPNKA 60 

Qy 61 PLTSAGILPVMQSLCPDGQRDEF GFLQYANSTVTQLLERLDRWEEGNLFD 111

: I I I : I I : I : : I I : I ::: I I : I : I : 
Db 61 -MPSAGMLPWLQGIFCNVNNPCFQSPTPGESPGIVSNYNNSI LARVYRDFQELLMNA 116 

Qy 112 PARPSLG SELEALRQHLEALSAGPGTSGSHLDRSTVSSFSLDSVARNPQELWRFLTQ 168

I I I . : I I I I : : I : | : | : : :: : | | | : 

Db 117 PESQHLGRIWTELHILSQFMDTLR THPERIAGRGI RI RDI LKDEETLTLFLI K 169 

QY 169 NLSLPNSTAQALLAARVDPPEVYHLLFGPSSALDSQSGLHKGQEPWSRLGGNPLFRMEEL 22 8 

| : | : | | : :: | | : | | | :::: 



Db 170 NIGLSDSWYLLINSQVRPEQFAH- GVPDLALKDI 203



Qy 229 LLAPALLEQLTCTPGSGELGRILTVPESQKGALQGYRDAVCSGQAAARARRFSGLSAELR 288 

: I I I I : : I I : : I I : I I : I 

Db 204 ACSEALLERF IIFSQRRGAKTVRYALCSLSQGT LQWIEDTLY 245 

Qy 289 NQLDVAKVSQQLGLDAPNGSDSSPQAPPPRRLQALLGDLLDAQPvVLQDV DVLSA 342 

: I I : : I I III I : | | : : : : : : | : | 

Db 246 ANVDFFKLFRVL— — PTLLDSRSQGINLRSWGGILSDM— SPRIQEFIHRPSMQDLLWV 299 

Qy 343 LALLLPQGACTGRTPGPPASGAGGAANGTGAGAVMGPNATAEEGAPSAAALATPDTLQGQ 402

I: I 1:1 

Db 300 TRPLMQNGG PET 311 

Qy 403 CSAFVQLWAGLQPILCGNNRTIEPEALRRGNMSSLGFTSKEQRN LGL 449 

I : I I : I I I II I II II II: 

Db 312 FTKLMGILSDLLCG YPEG GGSRVLSFNWYEDNNYKAFLGIDSTRKDPIY 360 

Qy 450 LVHLMTSNP KILYAPAGSEVDRVILKANETFAF 482

I : : I I I I I I I I I :: I I I I 

Db 361 SYDRRTTSFCNALIQSLESNPLTKIAWRAAKPLLMGKILYTPDSPAARRILKNANSTFEE 420

Qy 483 VGNVTHYAQVWLNI SAEI RS FLEQGRLQQHLR WLQQYVAELRLHPEALNL 532 

: : I : | : : | | : : | : | : : | : | | : 

Db 421 LEHVRKLVKAWEEVGPQIWYFFDNSTQMNMIRDTLGNPTVKDFLNRQLGEEGITAEAILN 480

Qy 533 SLDELPPALRQD NFSLPSGMALLQQLDT1DNAACGWIQFMSKVSVDIFKGFPDEESI 589 

I : I : I II : : : I I : : : : I I : : I I : 

Db 481 FLYKGPRESQADDMANFDWRDIFNITDRTLRLVN QYLECLVLDKFESYNDETQL 534

I 

Qy 590 VNYTLNQAYQDNVTVFASVI FQTRK — DGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPG 647 

I : : : I : =1 hi I I I I I I I I I I : I I I I : I : II I 

Db 535 TQRALS-LLEENM-FWAGWFPDMYPWTSSLPPHVKYKIRMDIDWEKTNKIKDRYWDSG 592

Qy 648 PNTGGRFYFLY GFVWIQDMMERAIIDTFVGHDWEPGSYVQMFPYPCYTRDDFLFVI 704

I I I I I : : I I I : I : I : |. : I • I : I MM: I I : : : 

Db 593 PRADPVEDFRYIWGGFAYLQDMVEQGITRSQVQAE-APVGIYLQQMPYPCFVDDSFMIIL 651 

Qy 705 EHMMPLCMVT SWVYSVAMTIQHIVAEKEHRLKEVMKTMGLNNAVHWVAWFITGFVQLSI S 764 

I : I I : : I : I I I : I I I I I I I I I I I : I I : : I I I I I h I MM 
Db 652 NRCFPIFMVLAWIYSVSMTVKSIVLEKELRLKETLKNQGVSNAVIWCTWFLDSFSIMSMS 711 

Qy 765 WALTAILKYGQVLMHSHVVIIWLFLA 824 

: M : M : M M I : M I I :: I I I I I I I M M I I I I M I I M II 
Db 712 IFLLTIFIMHGRILHYSDPFILFLFLLAFSTATIMLCFLLSTFFSKASLAAACSGVIYFT 771 

Qy 825 S YVP YMYVAI REEVAHDKI TAFEKCI AS LMSTTAFGLGS KYFALYEVAGVGI QWHT FSQS 884 

his: : |::|| | |j:| M! MM M hhll I 
Db 772 LYLPHILCFAWQ DRMT7VELKKAVSLLSPVAFGFGTEYLVRFEEQGLGLQWSNIGNS 827 

Qy 885 PVEGDDFNLLIAVTML1WDAVWGILTWYIEAVHPGMYGLPRPWYFPLQKSYWLGS 944 

I M h h II:: h : M I I I I M II:: I II II I I II I I I M I I I I 
Db 828 PTEGDEFSFLLSMQMMLLDAAVYGLLAWYLDQVFPGDYGTPLPWYFLLQESYWLGG 883

Qy 945 AWEWSWPWARTPRLSVMEEDQACA-MESRRFEETRGMEEE PTHLPLV 990 

: I : II IM : I I II : 
Db 884 EGCSTREERALEKTEPLTEETEDPEHPEGIHDSFFEREHP 923



991 VCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPT 1045 

IN I h : : : I : : : I : : Mil: : I I I I I I I I I I I I : I I I I I I I II 
924 GWVPGVCVKNLVKIFEPCGRPAVDRLNITFYENQITAFLGHNGAGKTTTLSILTGLLPPT 983 

1046 SGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMD 1105 
M : : I I I I : I : I : : I I I I I I I I : I I I I I I I : I I : : I I : I I I : I I : 
984 SGTVLVGGRDIETSLDAVRQSLGMCPQHNILFHHLTVAEHMLFYAQLKGKSQEEAQLEME 1043 

1106 KMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDL 1165 

I : I I I : I I : I I I I I I : I I I I I I I I I I I :: : I I I I I I : I I I I I : I I : I I I I 
1044 AMLEDTGLHHKRNEEAQDLSGGMQRKLSVAIAFVGDAKWILDEPTSGVDPYSRRSIWDL 1103 

1166 ILKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRP 1225 

MIM IMMMMI I I I Ml I I I II MM Ml I I : I I [ I I : I I MM:: 
1104 LLKYRSGRTIIMSTHHMDEADLLGDRIAIIAQGRLYCSGTPLFLKNCFGTGLYLTLVRK- 1162 

1226 AEPGGPQEPGLASSPPGRAPLSSCSEL QVSQFIRKH 1261

: I I III : : : | 

1163 --MKNIQSQRKGSEGTCSCSSKGFSTTCPAHVDDLTPEQVLDGDVNELMDWLHH 1215

1262 VASCLLVS DTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLK 1321 

I M M : : II : : I I : I I : I I : I I M I I I : M I I M I I I 

1216 VPEAKLVECI GQELI FLLPNKNFKHRAYASLFRELEETLADLGLS S FGI SDTPLEEI FLK 1275 

1322 VSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGS 1381 

MM I : : M : I I I I 
1276 VTEDSDSGPLFAGGAQQKRENVNP--RHPCLGP 1306

1382 ARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGG 1433

I I I I II I : I II M I : I 

1307 REKAGQT PQDSNVCS PGAPAAHPEGQPPPEPECPGPQLNTGT 1348 

1434 WLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPS 1493

Is: I I I I I I Ml : M : I I I I I : M : : : : I M I I I I 

1349 QLVLQHVQALLVKRFQHTIRSHKDFLAQIVLPATFVFLALMLSIVIPPFGEYPALTLHPW 1408

1494 QYHNYTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSL 1553 

I ::| : M I I :| I 
1409 IY GQQYTFFSMDEPGSEQFT VLADVLLNKPG 1439 

1554 GPTLNLSSGESRLLAARFFDSMCL-ESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAW 1612 

I, = M I : II I II I I 
1440 FGNRCLKEGWLPEYPCGN STPWKTPSVSP 1468

1613 NVSLPPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRW-TG 1668

: I I I III: II I I I I I : 

1469 — NITQLFQKQKWTQVNPSP SCRCSTREKLTMLPECPEGAGGLPPPQRTQRST 1519 

1669 D I LT DI TGHNVS E YLL FT S DRFRLH — RYGAIT FGNVLKS I PAS , 1710 

: I I M I M M : M I : I : : I M I : I I : I : 

1520 EILQDLTDRNISDFLVKTYPALIRSSLKSKFWVNEQRYGGISIGGKLPWPITGEALVGF 1579 

1711 FGTRAP PMVRKI AVRRAAQVFYNNKGYHSMPT YLN S LNNAI L 1752 

I :: : : M : M I M M : : : M Mill 

1580 LSDLGRIMWSGGPITREASKEIPDFLKHLETEDNIKVWFNNKGWHALVSFLNVAHNAIL 1639 



Qy 1753 RANLPKSKGNPAAYGITVTNHPMNKTSASLS-LDYLLQGTDWIAIFIIVAMSFVPASFV 1811 

I I : I I I : : I I I I I I : I : I I I I : I I I : I I : I : I I I I I I I I I 
Db 1640 RASLPKDR-SPEEYGITVISQPLNLTKEQLSEITVLTTSVDAVVAI CVIFSMSFVPASFV 1698 

Qy 1812 VFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPT 1871

::|: I: l:IMII:|| :| I I : I : : I I : : I I I I II I Mill 
Db 1699 LYLIQERVNKSKHLQFISGVSPTTYWVTNFLWDIMNYSVSAGLWGIFIGFQKKAYTSPE 1758 

Qy 1872 NFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEH 1931 

I I I : : : I Mill:: I : ! | I M I I : I M : I I I | | || I I I : : Ihhllh 
Db 1759 NLPALVALLLLYGWAVIPMMYPASFLFDVPSTAWALSCANLFIGINSSAITFILELFEN 1818 

Qy 1932 DKDLKVWSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTR 1991 

: : I I : I : : : M : : M M : : : I : : : : I I : I : : : M M : : : 
Db 1819 NRTLLRFNAVLRKLLIVFPHFCLGRGLIDLALSQAVTDVYARFGE-EHSANPFHWDLIGK 1877 

Qy 1992 GLVAMAVEGVVGFLLTIMCQYNF LRRPQRMPVSTKPVEDDVDVAS ERQRVLRGDA 2046 

I M MM! Mil:: I M : | : |: I :M III Mil:: I 

Db 1878 NLFAMWEGWYFLLTLLVQRHFFLSQWIAEPTKEPI — ' — VDEDDDVAEERQRIITGGN 1933 

Qy 2047 DNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDES 2106 

I : : : : I M : I I I I I I I : I I I I I I I I I I I I I I I I I I I : I I I I I I I I : 

Db 1934 KTDILRLHELTKIYPGTSSP AVDRLCVGVRPGECFGLLGVNGAGKTTTFKMLTGDTT 1990 

Qy 2107 TTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARV 2166 

I Ml I I Ml : : I I :: M I I I M : : M M M M M M : :: M 
Db 1991 VTSGDATVAGKSILTNISEVHQNMGYCPQFDAIDELLTGREHLYLYARLRGVPAEEIEKV 2050 

Qy 2167 VKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLW 2226 

I : : : I II III I M I I I I I M I I I I M I II I I : I I I I I I I I I I : II I II 
Db 2051 ANWSIKSLGLTW7U)CIAGTYSGGNKRKLSTAI7VLIGCPPLVLLDEPTTGMDPQ7VRRMLW 2110 

Qy 2227 NLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVR 2286 

MM : I : I I : I I I I I I I I I I I I I I I I I I I I I I I I I : I : I I I I I : : I I I I M : I : : 
Db 2111 NVIVSIIREGRAWLTSHSMEECEALCTRLAIMVKGAFRCMGTIQHLKSKFGDGYIVTMK 2170 

Qy 2287 TKSSQ SVKDWRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGV 2341 

II: : I : I I I M : : : I I I : : I : I : I I I I : : I : 

Db 2171 IKSPKDDLLPDLNPVEQFFQGNFPGSVQRERHYNMLQFQVSSS — SLARIFQLLLSHKDS 2228 

Qy 2342 LGIEDYSVSQTTLDNVFVNFAKKQSDN 2368 

I I I : I I I : I I I I I I II II I I : I : : : 
Db 2229 LLIEEYSVTQTTLDQVFVNFAKQQTES 2255 



RESULT 6 
ABC3_HUMAN



ID ABC3_HUMAN STANDARD; PRT; 1704 AA. 

AC Q99758; Q92473; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE ATP-binding cassette, sub-family A, member 3 (ATP-binding cassette 

DE transporter 3) (ATP-binding cassette 3) (ABC-C transporter).

GN ABCA3 OR ABC3.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 



OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606;
RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Thyroid carcinoma; 

RX MEDLINE=96326608; PubMed=8706931;

RA Klugbauer N., Hofmann F.;

RT "Primary structure of a novel ABC transporter with a chromosomal 

RT localization on the band encoding the multidrug resistance-associated 

RT protein."; 

RL FEBS Lett. 391:61-65(1996). 
RN [2] 

RP SEQUENCE FROM N.A.

RX MEDLINE=97179225; PubMed=9027511; 

RA Connors T.D., van Raay T.J., Petry L.R., Klinger K.W., Landes G.M., 

RA Burn T.C.;

RT "The cloning of a human ABC gene (ABC3) mapping to chromosome

RT 16p13.3.";

RL Genomics 39:231-234(1997). 

CC -!- FUNCTION: May be a transporter, its natural substrate has not been 
CC found yet (By similarity). May act as an efflux pump for

CC chemotherapeutics drugs. 

CC -!- TISSUE SPECIFICITY: Highly expressed in lung, followed by brain, 
CC pancreas, skeletal muscle and heart. Weakly expressed in placenta, 

CC kidney and liver. Also expressed in medullary thyroid carcinoma 

CC cells (MTC) and in C-cell carcinoma. 

CC -!- DOMAIN: Multifunctional polypeptide with two homologous halves, 
CC each containing an hydrophobic membrane-anchoring domain and an 

CC ATP binding cassette (ABC) domain (By similarity).

CC -!- SIMILARITY: Belongs to the ABC transporter family. ABC A subfamily. 
CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U78735; AAC50967.1; 

DR EMBL; X97187; CAA65825.1; -. 

DR PIR; A59188; A59188. 

DR PIR; S71363; S71363. 

DR Genew; HGNC:33; ABCA3.

DR MIM; 601615; -.

DR GO; GO:0016021; C:integral to membrane; TAS.

DR GO; GO:0005624; C:membrane fraction; TAS.

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; TAS.

DR GO; GO:0005215; F:transporter activity; TAS.

DR GO; GO:0009315; P:drug resistance; TAS.

DR GO; GO:0006810; P:transport; TAS.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR Pfam; PF00005; ABC_tran; 2.

DR ProDom; PD000006; ABC_transporter; 2.

DR SMART; SM00382; AAA; 2. 

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.

KW ATP-binding; Transport; Transmembrane.

FT TRANSMEM 22 42 POTENTIAL.

FT TRANSMEM 249 269 POTENTIAL.

FT TRANSMEM 307 327 POTENTIAL.

FT TRANSMEM 344 364 POTENTIAL.

FT TRANSMEM 373 393 POTENTIAL.

FT TRANSMEM 405 425 POTENTIAL.

FT TRANSMEM 447 467 POTENTIAL.

FT TRANSMEM 925 945 POTENTIAL.

FT TRANSMEM 1100 1120 POTENTIAL.

FT TRANSMEM 1144 1164 POTENTIAL.

FT TRANSMEM 1183 1203 POTENTIAL.

FT TRANSMEM 1213 1233 POTENTIAL.

FT TRANSMEM 1245 1265 POTENTIAL.

FT TRANSMEM 1306 1326 POTENTIAL.

FT NP_BIND 566 573 ATP (POTENTIAL)

FT NP_BIND 1416 1423 ATP (POTENTIAL)

FT CONFLICT 36 36 P -> S (IN REF. 2).

FT CONFLICT 196 196 L -> P (IN REF. 2).

SQ SEQUENCE 1704 AA; 191387 MW; AF0098DAF7A04F5F



Query Match 20.7%; Score 2622; DB 1; Length 1704; 

Best Local Similarity 34.0%; Pred. No. 4.3e-151; 

Matches 638; Conservative 317; Mismatches 556; Indels 364; Gaps 45;



Qy 581

Db 108

Qy 628

Db 164

Qy 673

Db 224

Qy 727

Db 282

Qy 782

Db 342

Qy 842

Db 398

Qy 901

Db 457

Qy 961

Db 504



I: :l II :| |:|:|: II I I : I 



QNSSFTEK TNEIRRAYWRPG PNTGGRFYFLYGFVWIQDMMERAII 672

I M I I : : II- 1:1 I ||: :| ::|||: 



-VQMFPYPCYTRDDFLFVIEHMMPLCMVISWVYSVAMTIQH 726
: : I I I I : III I : : : I | : : : | : | : : 



I I I I I I I I I : I I I ::: | | | | I : I : I I : : : I 



M : I : : I I I I : I I : I I I : I : I I I : I I : | : | | : | | | 



: I : I : I : I I : I :: ■ ": I I : I I I I II I III 



::|:|:||::|||:||| II :|:|:||ll : III I I I 



I I : : : : I I IN I : : I : I I :: : : I : I : I I I | | 



1019 QWSFLGHNGAGKTTTMSILTGLFPPTSGSATI YGHDIRTEMDEIRKNLGMCPQHNVLFD 1078 
I • I 1 I I I I I I I I I : I : I I I I I I I I II I I I : : I : I : I I I : I I : I I I I : : I I I 
560 QITVLLGHNGAGKTTTLSMLTGLFPPTSGRAYISGYEISQDMVQIRKSLGLCPQHDILFD 619 



1079 RLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAF 1138 
I I I I I I : I I : : I I : : : : : | : : | : : | : | : | : I I I I I : I I I I : I I 
620 NLTVAEHLYFYAQLKGLSRQKCPEEVKQMLHIIGLEDKWNSRSRFLSGGMRRKLSIGIAL 679 

1139 VGGSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEADLLGDRIAIISHG 1198
: I I : : M I I M : I : I :|||||||: : | I I I : I : I I || I I M I I I I I M : : I 
680 IAGSKVLILDEPTSGMDAISRRAIWDLLQRQKSDRTIVLTTHFMDEADLLGDRIAIMAKG 739

1199 KLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFI 1258 
: I : I I I I I I I I I I I I : I I I I I I : : | J : 

740 ELQCCGSSLFLKQKYGAGYHMTLVKEP- HCNPEDISQLV 777

1259 RKHVASCLLVSDTSTELSYILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEV 1318 
M : M 111:1111: : II II ||: I : : I I I ||:||| 

778 HHHVPNATLESSAGAELSFILPRESTHR--FEGLFAKLEKKQKELGIASFGASITTMEEV 835

1319 FLKVSEEDQSLENSEADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASS 1378 
I I : I : I : I | : : : I I : I I : I I I 
836 FLRVGK LVDSSMDIQAIQ--LPALQ-- YQHERRASDWAVDSNL 874

1379 VGSARGDEGAGYTDVYGDYRPLFDNPQDPDNVSLQEVEAEALSRVGQGSRKLDGGW-LKV 1437 
I • -M : I I I I : II: I I 

875 CGAMDPSDGIG ALIEEERTAV KLNTGLALHC 905 

1438 RQFHGLLVKRFHCARRNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHN 1497 
: I I —I: : I I : :|:|:| | :|: I : I I I I : :| 

906 QQ FWAMFLKKAAYSWREWKMVAAQVLVPLT^ — 963 

1498 YTQPRGNFIPYANEERREYRLRLSPDASPQQLVSTFRLPSGVGATCV-LKSPANGSLGPT 1556 

I I I I II 

964 GRTWPFSVPGTSQLGQQ 981 

1557 LNLSSGESRLLAARFFDSMCLESFTQGLPLSNFVPPPPSPAPSDSPASPDEDLQAWNVSL 1616 



982 LS- 



983 



1617 PPTAGPEMWTSAPSLPRLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITG 1676 

I : : I : I : I I II : 
984 EHLKDALQAEG QEPREVLGDL 1004 

1677 HNVSEYLLFTSDRFRLHRYGAITFG--NVLKSIPASFGTRAPPMVRKIAVRRAAQVFYNN 1734

Is I: I : :: I I : I I I I : I :| | 

1005 EEFLIFRA SVEGGGFNERCLVAASF RDVGERTWNALFNN 1044 

1735 KGYHSMPTYLNSLNNAILRANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLLQG-TDV 1793 

: I I I I I : : I : | I I I I : I I : : : : | : | 

1045 QAYH S P AT ALAWDN L L F KLLCGPHA-SIWSNFPQPRSALQAAKDQFNEGRKGF 1098 

1794 VIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFVSGCNPIIYWLANYVWDMLNYLVPAT 1853

lh 11:1: ::l : hh: :lll:||||| : :||: :||::::|:|: 
1099 DIALNLLFAMAFLASTFSILAVSERAVQAKHVQFVSGVHVASFWLSALLWDLISFLIPSL 1158 



Qy 1854 CCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINL 1913

: : : , I I : I : I : I I I I I I I : I I : I I : I : I : : I | | : | : 
Db 1159 LLLWFKAFDVRAFTRDGHMADTLLLLLLYGWAIIPLMYLMNFFFLGAATAYTRLTIFNI 1218

Qy 1914 FIGITATVATFLLQLFEHDKDLKV— VNSYLKSCFLIFPNYNLGHGLMEMAYNEY 1966 

II MM: : I : : : I I I : I I : I I : | | 

Db 1219 LSGI ATFLMVTIMRIPAVKLEELSKTLDHVFLVLPNHCLGMAVSSF-YENYETRRY 1273 

Qy 1967 INEYYAKIGQFDKMKSPFEWDI— VTRGLVAMAVEGWGFLLTIMCQYNFLRRPQ 2019 

: Ml :: : I I I : : I I I : | : : | | : | : 

Db 1274 CTSSEVAAHYCKKYNIQYQENFYAWSAPGVGRFVASMAASGCAYLILLFLIETNLLQRLR 1333

Qy 2020 RMPVSTKPVEDDVDVASERQRVLRGDADNDM VKIENLTKV 2059 

I I I I : M III II Ml I : : : I : I : I I 

Db 1334 GILCALRRRRTLTELYTRMPV LPEDQDVADERTRILAPSPDSLLHTPLIIKELSKV 1389 

Qy 2060 YKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSV 2119 

M I : I I II I I I I : I I I I I I M M II I I M II I I I I : II I Mill II : 
Db 1390 YEQRV--PLLAVDRLSLAVQKGECFGLLGFNGAGKTTTFKMLTGEESLTSGDAFVGGHRI 1447

Qy 2120 LKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKY 2179 

I : I : I I I I I I I I I : I | I I : I | | | | I : | : | | | : 

Db 1448 SSDVGKVRQRIGYCPQFDALLDHMTGREMLVMYARLRGIPERHIGACVENTLRGLLLEPH 1507 

Qy 2180 ADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSV 2239 

Ml. I II II II I I I II I I II I II I I II I I : I II I I Ml IM : : : I : : : 

Db 1508 ANKLVRTYSGGNKRKLSTGIALIGEPAVIFLDEPSTGMDPVARRLLWDTVARARESGKAI 1567

Qy 2240 VLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHLKNRFGDGYMITVRTKS SQSVKDV 2296 

:: I M I I II II II I II I I I I I I : Mill II II :: II II : : : I ::: :: 
Db 1568 IITSHSMEECEALCTRLAIMVQGQFKCLGSPQHLKSKFGSGYSLRAKVQSEGQQEALEEF 1627 

Qy 2297 VRFFNRNFPEAMLKERHHTKVQYQLKSEHISLAQVFSKMEQVSGVLGIEDYSVSQTTLDN 2356 

Is I I : : I : : I III : I I : I I : I : M : I II I I I : M 

Db 1628 KAFVDLTFPGSVLEDEHQGMVHYHLPGRDLSWAKVFGILEKAKEKYGVDDYSVSQISLEQ 1687 

Qy 2357 VFVNFAKKQSDNLEQ 2371 

I M: I I I I : 
Db 1688 VFLSFAHLQPPTAEE 1702



RESULT 7 
CED7_CAEEL



ID CED7_CAEEL STANDARD; PRT; 1704 AA. 

AC P34358; 076287; P34359; 

DT 01-FEB-1994 (Rel. 28, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE ABC transporter ced-7 (Cell death protein 7).

GN CED-7 OR C48B4.4.

OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;

OC Rhabditidae; Peloderinae; Caenorhabditis.

OX NCBI_TaxID=6239;

RN [1]

RP SEQUENCE FROM N.A. (ISOFORM C), FUNCTION, AND MUTAGENESIS OF LYS-586;

RP GLU-639 AND LYS-1417. 



RC STRAIN=Bristol N2; 

RX MEDLINE=98297348; PubMed=9635425;

RA Wu Y.-C., Horvitz H.R.;

RT "The C. elegans cell corpse engulfment gene ced-7 encodes a protein 

RT similar to ABC transporters."; 

RL Cell 93:951-960(1998). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Bristol N2;

RX MEDLINE=94150718; PubMed=7906398;

RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,

RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,

RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fraser A.,

RA Fulton L., Gardner A., Green P., Hawkins T., Hillier L., Jier M.,

RA Johnston L., Jones M., Kershaw J., Kirsten J., Laisster N.,

RA Latreille P., Lightning J., Lloyd C., Mortimore B., O'Callaghan M.,

RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,

RA Sims M., Smaldon N., Smith A., Smith M., Sonnhammer E., Staden R.,

RA Sulston J., Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K.,

RA Waterston R., Watson A., Weinstock L., Wilkinson-Sproat J.,

RA Wohldman P.;

RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C. 

RT elegans."; 

RL Nature 368:32-38(1994). 

RN [3] 

RP REVISIONS, AND ALTERNATIVE SPLICING. 

RA Durbin R.; 

RL Submitted (OCT-2000) to the EMBL/GenBank/DDBJ databases. 

CC -!- FUNCTION: Functions in the engulfment of cell corpses during 

CC embryonic programed cell death to translocate molecules that 

CC mediate homotypic adhesion between cell surfaces of the dying and 

CC engulfing cells. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Potential). 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=3; 

CC Comment=Experimental confirmation may be lacking for some 

CC isoforms; 

CC Name=c; 

CC IsoId=P34358-1; Sequence=Displayed; 

CC Name=a; 

CC IsoId=P34358-2; Sequence=VSP_000044, VSP_000045; 

CC Name=b; 

CC IsoId=P34358-3; Sequence=VSP_000044; 

CC -!- TISSUE SPECIFICITY: Ubiquitous in embryos. Expressed in larval 

CC germline precursors. Expression in larvae and adults is seen in 

CC amphid sheath cells, pharyngeal-intestinal valve and phasmid 

CC sheath cells. Low levels of expression are also seen in gonadal 

CC sheath cells. 

CC -!- DOMAIN: Multifunctional polypeptide with two homologous halves, 
CC each containing a hydrophobic membrane-anchoring domain and an ATP 

CC binding cassette (ABC) domain. 

CC -!- SIMILARITY: Belongs to the ABC transporter family. ABC A subfamily. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; AF049142; AAC24116.1; 








DR EMBL; Z29117; CAA82384.2; -. 

DR EMBL; Z29117; CAA82383.2; -. 

DR EMBL; Z29117; CAC42271.1; -. 

DR PIR; T42749; T42749. 

DR WormPep; C48B4.4a; CE24856. 

DR WormPep; C48B4.4b; CE24857. 

DR WormPep; C48B4.4c; CE27867. 

DR GO; GO:0016021; C:integral to membrane; NAS. 

DR GO; GO:0004009; F:ATP-binding cassette (ABC) transporter acti...; NAS. 

DR GO; GO:0008219; P:cell death; IMP. 

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter. 

DR Pfam; PF00005; ABC_tran; 2. 

DR ProDom; PD000006; ABC_transporter; 2. 

DR SMART; SM00382; AAA; 2. 

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 2. 

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2. 

KW ATP-binding; Transport; Transmembrane; Repeat; Glycoprotein; 

KW Alternative splicing. 
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POTENTIAL. 
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963 




983 


POTENTIAL. 






FT TRANSMEM 1126 1146 POTENTIAL. 

FT TRANSMEM 1176 1196 POTENTIAL. 

FT TRANSMEM 1201 1221 POTENTIAL. 

FT TRANSMEM 1234 1254 POTENTIAL. 

FT TRANSMEM 1311 1331 POTENTIAL. 

FT NP_BIND 580 587 ATP (POTENTIAL). 

FT NP_BIND 1411 1418 ATP (POTENTIAL). 

FT CARBOHYD 126 126 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 145 145 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 359 359 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 421 421 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 427 427 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 481 481 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 678 678 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 727 727 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 899 899 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 986 986 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 1012 1012 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 1045 1045 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 1597 1597 N-LINKED (GLCNAC...) (POTENTIAL). 

FT CARBOHYD 1632 1632 N-LINKED (GLCNAC...) (POTENTIAL). 

FT VARSPLIC 496 508 Missing (in isoform a and isoform b). 

FT /FTId=VSP_000044. 

FT VARSPLIC 992 993 Missing (in isoform a). 

FT /FTId=VSP_000045. 

FT MUTAGEN 586 586 K->R: CELL CORPSES NOT ENGULFED. 

FT MUTAGEN 639 639 E->G: CELL CORPSES NOT ENGULFED. 



FT MUTAGEN 1417 1417 K->R: SOME CELL CORPSES NOT ENGULFED. 

SQ SEQUENCE 1704 AA; 191411 MW; B7502A0B24507CFE CRC64; 



Query Match 12.0%; Score 1515; DB 1; Length 1704; 

Best Local Similarity 25.4%; Pred. No. 1.2e-83; 

Matches 529; Conservative 332; Mismatches 642; Indels 582; Gaps 75; 

Qy 447 LGLLVHLM TSNPKI LY- -APAGS EVDRVI LKAN ETFAFVGNVT 487 

I I I I : I : I I : I : : I I : I I : : I : I I : 

Db 36 LGP LVYLWKNADHT S S P ENI YDN FQVKGTVEDVFLESNFI KP I YKRWCLRS DVWGYT S 95 

Qy 488 HYAQVWLNISAEIRSFLEQGRLQQHLRWLQQYVAELRLHPEALNLSLDELPPALRQDNFS 547 

I : : : I I I I I : I : I : I : : I III 
Db 96 KDAAAKRTVDDLMKKFAE — RFQS AKLKLSVKN-ESSEEQLLTVLRND 140 

Qy 54 8 LPSGMALLQQLDTIDNAACGWIQFMSKVSVDIFKGFPDEESIVNYTLNQAYQDNVTVFAS 607 

I : : I : : I : I II : I II 
Db 141 LPMLNET FCAI NS YAAGV VF DEVDVTNKKLN 171 

Qy 608 VI FQTRKDGSLPPHVHYKIRQNSSFTEKTNEIRRAYWRPGPNTGGRF YFLYG 659 

I : I : | : | : : | : I I : | : 

Db 172 YRILLGKT-PEETWHLTETSYNPYGPSSGRYSRIPSSPPYWTSA 214 

Qy 660 FVWI QDMMERAI I DT FVGHDWE PGS YVQMFPYPCYTR DDFLFVIEHM 707 

I : I : I : : : I : I : :: I I I III: 
Db 215 FLTFQHAIESSFLSS VQSGAPDLPITLRGLPEPRYKTSSVSAFIDFFPFI 264 

Qy 708 MPLCMVISWVYSVAMTIQHI VAEKEHRLKEVMKTMGLNN AVHWVAWFITGFVQ 760 

I : : : I I : I : I : I : III: III I : II 

Db 265 WAFVTFINVIHITREIAAENHAVKPYLTAMGLSTFMFYA7VHV^4AFLKFFVI 316 

Qy 761 LSI SVT ALTAI LKYGQVLMHSHWI IWLFLAVYAVATIMFCFLVSVLY SKAKLASA 816 

I : | | : : : : : : | : : : | : : : | | : : | | 

Db 317 FLCSIIPLTFVMEF VSPAALIVTVLM — YGLGAVI FGAFVAS FFNNTNSAI KAI LV 370 

Qy 817 CGGIIYFLSYVPYMYVAI REEVAHDKITAFEKC-IASLMSTTAFGLGSKYFALYEVAGVG 875 

I : : I I : | | : I : I :: I : I I I I : : I 

Db 371 AWGAMIGISY KLRPEL— DQISS CFLYGLNINGAFALAVEAISDYMRRERE 419 

Qy 876 IQ-WHTFSQSPVEGDDFNLLLAWMLMVT)AVV^ 934 

: : I : I : I : I I : I : : I I : 
Db 420 LNLTNMFNDSSLH FSLGWALVMMIVDIL 447 

Qy 935 SYWLGSG RTEAWEWSWPWART PRLSVMEEDQACAMESRRFEETRG — 979 

I : I I I I : : I II I : I I : I I : I 

Db 44 8 — WMS I GALWDHI RTSA- DFS LRT L FD FEAP EDDENQT DGVTAQNT RI N EQYRN RV 501 

Qy 980 MEEEPTHLPLV ; VCVDKLTKVYKDDKKL 1006 

II: : : | | | : : : 

Db 502 RRSDMEIQMNPMASTSLNPPNADSDSLLEGSTEADGARDTARADII VT^NLVKIWSTTGER 561 

Qy 1007 ALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGSATIYGHDIRTEMDEIRKN 1066 

I : : I I I I I I I I I I I I : I I I : I : I I : I I I I : I : I II:: 

Db 562 AVDGLSLRAVRGQCSILLGHNGAGKSTTFSSIAGIIRPTNGRITICGYDVGNEPGETRRH 621 

Qy 1067 LGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIEDLELSNKRHSLVQTLSG 1126 

: I I I I I : I I : I : I I I I I I II :::::: I ::: | :: | | : | | | 



Db 622 IGMCPQYNPLYDQLTVSEHLKLVYGLKGAREKDFKQDMKRLLSDVKLDFKENEKAVNLSG 681 



Qy 1127 GMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKYKPGRTILLSTHHMDEAD 1186 

I I I I I I I : I I I :: I I I I I I I : I I I I : : | : : | I I I I I : I I : I I I I : 
Db 682 GMKRKLCVCMALIGDSEVVLLDEPTAGMDPGARQDVQKLVEREKANRTILLTTHYMDEAE 741 

Qy 1187 LLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEPGGPQEPGLASSPPGRAPL 1246 

I I I : I : I I I I i I : : I I : I I I I I : I : | : 
Db 742 RLGDWFIMSHGKLVASGTNQYLKQKFGTGYLLTW LDHNGDK 784 

Qy 1247 SSCSELQVSQFIRKHVASCLLVSDTST ELSYILPSEAAKKGAFERL 1292 

II :. : : : I I : : I I I I : I I I 

Db 785 RK MAVILTDVCTHYVKEAERGEMHGQQIEIILPE — ARKKEFVPL 827 

Qy 1293 FQHLE- RSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENS 1332 

I I I I I I : I I I I I I I : I I : : : : : : : 
Db 828 FQALEAIQDRNYRSNVFDNMPNTLKSQLATLEMRSFGLSLNTLEQVFITIGDK VDKA 884 

Qy 1333 EADVKESRKDVLPGAEGPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTD 1392 

1:11 : I I I : I : I I . I I 
Db 885 IASRQNSR ISHNSRNASEPSLKPAGYDTQSSTKSA 919 

Qy 1393 VYGDYRPLFDNPQDPDNVSLQEVEAE7VLSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCAR 1452 

i-ll ::l II: III : : I : I : I 

Db 920 — DSYQKLMD SQARGPEKSGVAKM VAQFISIMRKKFLYSR 957 

Qy 1453 RNSKALFSQILLPAFFVCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEE 1512 

I I I I : I : I : I : : : I I I : | 
Db 958 RN W AQ L FT Q VL IPIILLGL VGSLTTL KSN 986 

Qy 1513 RREYRLRLSPDASPQQLVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFF 1572 

: I III: : I : I I : : 
Db 987 --NTDQFSVRSLTPSGIEPSKWWRFENGTI 1015 

Qy 1573 DSMCLESFTQGLPLSN FVP P P P S PAP S D S PAS P D E D LQ AWN VS L P P TAG P EMWT S AP S L P 1632 

1:1 I : 

Db 1016 PEE -AANFE 1023 

Qy 1633 RLVREPVRCTCSAQGTGFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRL 1692 

: : : I : II : : : f | 
Db 1024 KILRKS GGF EVLNYNTKNPL 1043 

Qy 1693. HRYGAI T FGNVLKS I PAS FGTRAP PMVRKI AVRRAAQVFYNNKG YHSMPT YLNSLNNAI L 1752 

I : I I : . I I : : : I : I I : | | : : : | 
Db 1044 PNITKSL 1 G EMP PAT I GMTMN S DN L EAL FNMR Y YH VL PTLISMIN 1088 

Qy 1753 RANLPKSKGNPAAYGITVTNHPMNKTSASLSLDYLL — QGTDWIAIFIIVAMSFVPASF 1810 

Ml : : J : : III II I I I : : I : I : : I : : I 

Db 1089 RARLTGTVDAEISSGVFL ; YSKSTSNSNLLPSQLIDVLLAPMLILIFAMVTSTF 1141 

Qy 1811 WFLVAEKSTKAKHLQFVSGCNPIIYWIANYVWDMLNYLVPATCCVIILFVFDLPAYTSP 1870 

1:11: I:: : I I I : : I : I I : : I : : : | : | : |:| ||:| | : 
Db 1142 VMFLIEERTCQFAHQQFLTGISPITFYSASLIYDGILY— SLICLIFLFMF-LAFHWMY 1197 

Qy 1871 TNFPAVLSLFLLYGWSITPIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLL-QLF 1929 

: I : : J I : I I : I I I I : I I I I I I : : I I : I I I : : I 
Db 1198 DH LAI VI L FW FLY F FS S VP F I YAVS FL FQ S P S KAN VL L 1 1 WQWI S GAAL LAVFL I FMI F 1257 



Qy 1930 EHDKDLK--VWSYLKSCFLIFPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWD 1987 

1 : 1 1 : 1 1 : : : 1 : 1 1 :: III I : | | 

Db 1258 NIDEWLKSILVNIFM FLLPS YAFGSAI IT INTY GMILPSEELMNWD 1303 

Qy 1988 IVTRGLVAMAVEGWGFLLTIMCQYNFLRR— PQRMPV STKPVEDDV DVA 2035 

: 1 1 1 1 1 : : 1 : 1 : 1 1 1 1 : : I : I : | : 

Db 1304 HCGKNAWLMGTFGVCSFALFVLLQFKFVRRFLSQVWTVRRSSHNNVQPMMGDLPVCESVS 1363 

Qy 2036 SERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKT 2095 

M:M 1 :: I : |::||| : || ||: ||| | 1 1 1 1 1 1 I I I I I I I I 

Db 1364 EERERVHRVNSQNSALVIKDLTKTF GRFTAVNELCLAVDQKECFGLLGVNGAGKT 1418 

Qy 2096 STFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRL 2155 

: 1 1 : 1 1 1 : 1 1 1 : 1 1 1 1 : 1 : 1 1 1 1 1 1 1 1 : 1 1 I I I : : : : 

Db 1419 TT FNI LTGQS FAS SGEAMI GGRDV-TELI SIGYCPQFDALMLDLTGRESLEILAQM 1473 

Qy 2156 RGI-SWKDEARWKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPT 2214 

1 : : 1 : 1 : : 1 1 : : : 1 1 1 1 1 1 1 1 1 1 : 1 : 1 1 : 1 1 1 1 I 1 

Db 1474 HGFENYKAKAELI LECVGMI AHADKLVRFYSGGQKRKI S VGVALLAPTQMI I LDEPT 1530 

Qy 2215 TGMDPKARRFLWNLILDLIKTGRS-WLTSHSMEECEALCTRLAIMVNGRLRCLGSIQHL 2273 

1 : 1 1 1 1 1 1 : 1 1 : 1 : 1 : : 1 1 1 1 1 1 : 1 1 1 1 1 1 : 1 : 1 : : II : 1 I I I 

Db 1531 AGIDPKARREWELLLWCREHSNSALMLTSHSMDECEALCSRIAVLNRGSLIAIGSSQEL 1590 

Qy 2274 KNRFGDGYMITVRTKSSQSVKDVVRFFNRNFPEAMLKERHHTK VQYQL-KSEHISLA 2329 

1 : : 1 : 1 : 1 : I I : | : : | | | : : : | : | r : 

Db 1591 KSLYGNNYTMTLSLYEPNQRDMWQLVQTRLPNSVLKTTSTNKTLNLKWQIPKEKEDCWS 1650 

Qy 2330 QVFSKMEQVSGVLGIEDYSVSQTTLDNVFVNFAKKQSDNLEQQET 2374 

1 :: 1 1 : : 1 : : : 1 : : 1 : |: 1 1 I : 1 

Db 1651 AKFEWQALAKDLGVKDFILAQSSLEETFLRLAGLDEDQLDTHST 1695 





RESULT 8 
DRRA_STRPE 



ID DRRA_STRPE STANDARD; PRT; 330 AA. 

AC P32010; 

DT 01-JUL-1993 (Rel. 26, Created) 

DT 01-JUL-1993 (Rel. 26, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Daunorubicin resistance ATP-binding protein drrA. 

GN DRRA. 

OS Streptomyces peucetius . 

OC Bacteria; Actinobacteria; Actinobacteridae; Actinomycetales ; 

OC Streptomycineae; Streptomycetaceae; Streptomyces. 

OX NCBI_TaxID=1950; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=ATCC 29050; 

RX MEDLINE=92020891; PubMed=1924314; 

RA Guilfoile P.G., Hutchinson C.R.; 

RT "A bacterial analog of the mdr gene of mammalian tumor cells is 

RT present in Streptomyces peucetius, the producer of daunorubicin and 

RT doxorubicin."; 

RL Proc. Natl. Acad. Sci. U.S.A. 88:8553-8557(1991). 

CC -!- FUNCTION: DRRA AND DRRB MAY ACT JOINTLY TO CONFER DAUNORUBICIN AND 



CC DOXORUBICIN RESISTANCE BY AN EXPORT MECHANISM. 

CC -!- SIMILARITY: Belongs to the ABC transporter family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; M73758; AAA74717.1; -. 

DR PIR; S27707; S27707. 

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter. 

DR InterPro; IPR005894; Drr_ABC_transpt. 

DR Pfam; PF00005; ABC_tran; 1. 

DR ProDom; PD000006; ABC_transporter; 1. 

DR SMART; SM00382; AAA; 1. 

DR TIGRFAMs; TIGR01188; drrA; 1. 

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1. 

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 1. 

KW ATP-binding; Transport; Antibiotic resistance. 

FT NP_BIND 41 48 ATP (BY SIMILARITY). 

SQ SEQUENCE 330 AA; 35700 MW; 582D66C90D54E6B9 CRC64; 



Query Match 3.2%; Score 405; DB 1; Length 330; 

Best Local Similarity 30.3%; Pred. No. 3.3e-17; 

Matches 110; Conservative 65; Mismatches 148; Indels 40; Gaps 8; 



Qy 980 MEEEPTHLPLVVCVDKLTKVYKDDKKLALNKLSLNLYENQVVSFLGHNGAGKTTTMSILT 1039 

1 : 1 1 : 1 1 1 1 : 1 : : 1 1 1 : 1 1 1 1 1 1 1 1 : 1 1 : : 1 

Db 1 MNTQPTR AIETSGLVKVYNGTR- -AVDGLDLNVPAGLVYGILGPNGAGKSTTIRMLA 55 

Qy 1040 GLFPPTSGSATIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEE 1099 

II |:| ::|||: :| | :|: : : |: : : || |:| ||: :„ 

Db 56 TLLRPDGGTARVFGHDVTSEPDTVRRRI SVTGQYASVDEGLTGTENLVMMGRLQGYSWAR 115 

Qy 1100 IRREMDKMIEDLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYAR 1159 

1 : : 1 : 1 : 1 1 : : 1 1 1 1 1 : 1 : 1 : 1 : 1 : 1 1 1 1 1 1 : 1 1 : 1 

Db 116 ARERAAELIDGFGLGDARDRLLKTYSGGMRRRLDIAASIWTPDLLFLDEPTTGLDPRSR 175 

Qy 1160 RAIWDLI-LKYKPGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDG-Y 1217 

: 1 1 : : 1 1 : 1 1 : 1 : : 1 1 1 1 1 1 1 1 1 : 1 1 1 : : 1 : 1 1 : 1 

Db 176 NQWDIVI^vTDAGTTVljLTTQYLDEADQLADRIAVIDHGRVIAEGTTGELKSSLGSNVL 235 

Qy 1218 RLTLVKRPAEPGGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSY 1277 

Ml II :: | : | | : : 

Db 236 RLRL HDAQS RAEAERLLSAELGYT I HRD SDPTALSAR 272 

Qy 1278 ILPSEAAKKGAFERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVS EEDQSLENSEA 1334 

1 : 1 III : II 1 : : 1 : 1 1 1 1 : : :|;| I : 

Db 273 IDDPRQGMRALAELSRTHLE VRSFSLGQSSLDEVFLALTGHPADDRSTEEAAE 325 

Qy 1335 DVK 1337 

: 1 

Db 326 EEK 328 





RESULT 9 
NODI_RHILO 



ID NODI_RHILO STANDARD; PRT; 340 AA. 

AC P23703; Q8KJI6; 

DT 01-NOV-1991 (Rel. 20, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Nod factor export ATP-binding protein I (Nodulation ATP-binding 

DE protein I). 

GN NODI OR MLR6164. 

OS Rhizobium loti (Mesorhizobium loti). 

OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales; 

OC Phyllobacteriaceae; Mesorhizobium. 

OX NCBI_TaxID=381; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=NZP 2213; 

RX MEDLINE=91067466; PubMed=2251131; 

RA Young C.A., Collins-Emerson J.M., Terzaghi E.A., Scott D.B.; 

RT "Nucleotide sequence of Rhizobium loti nodI."; 

RL Nucleic Acids Res. 18:6691-6691(1990). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=R7A; 

RX MEDLINE=21999272; PubMed=12003951; 

RA Sullivan J.T., Trzebiatowski J.R., Cruickshank R.W., Gouzy J., 

RA Brown S.D., Elliot R.M., Fleetwood D.J., McCallum N.G., Rossbach U., 

RA Stuart G.S., Weaver J.E., Webby R.J., de Bruijn F.J., Ronson C.W.; 

RT "Comparative sequence analysis of the symbiosis island of 

RT Mesorhizobium loti strain R7A."; 

RL J. Bacteriol. 184:3086-3095(2002). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=MAFF303099; 

RX MEDLINE=21082930; PubMed=11214968; 

RA Kaneko T., Nakamura Y., Sato S., Asamizu E., Kato T., Sasamoto S., 

RA Watanabe A., Idesawa K., Ishikawa A., Kawashima K., Kimura T., 

RA Kishida Y., Kiyokawa C., Kohara M., Matsumoto M., Matsuno A., 

RA Mochizuki Y., Nakayama S., Nakazaki N., Shimpo S., Sugimoto M., 

RA Takeuchi C., Yamada M., Tabata S.; 

RT "Complete genome structure of the nitrogen-fixing symbiotic bacterium 

RT Mesorhizobium loti."; 

RL DNA Res. 7:331-338(2000). 

CC -!- FUNCTION: Part of the ABC transporter complex nodIJ (TC 

CC 3.A.1.102.1) involved in the export of LCO (lipo-chitin 

CC oligosaccharide) and a modified beta-1,4-linked N- 

CC acetylglucosamine oligosaccharide. Responsible for energy coupling 

CC to the transport system. Therefore this complex is implicated in 

CC the nodulation induction process (By similarity). 

CC -!- SUBUNIT: The complex is composed of two ATP-binding proteins 

CC (nodI) and two transmembrane proteins (nodJ) (Probable). 

CC -!- SUBCELLULAR LOCATION: Inner membrane-associated (By similarity). 

CC -!- SIMILARITY: Belongs to the ABC transporter family. NodI subfamily. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; X55705; CAA39236.1; ALT_INIT. 

DR EMBL; AL672113; CAD31532.1; ALT_INIT. 

DR EMBL; AP003008; BAB52501.1; 

DR PIR; S13590; S13590. 

DR HSSP; Q58663; 1G6H. 

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter. 

DR InterPro; IPR005978; ABC_transptNodI. 

DR Pfam; PF00005; ABC_tran; 1. 

DR ProDom; PD000006; ABC_transporter; 1. 

DR SMART; SM00382; AAA; 1. 

DR TIGRFAMs; TIGR01288; nodI; 1. 

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1. 

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 1. 

KW Nodulation; Transport; Membrane; Inner membrane; ATP-binding; 

KW Complete proteome. 
FT NP_BIND 74 81 ATP 1 (By similarity). 

FT NP_BIND 217 224 ATP 2 (By similarity). 

FT CONFLICT 10 10 D -> E (IN REF. 1). 

FT CONFLICT 14 14 L -> F (IN REF. 2). 

FT CONFLICT 23 23 S -> F (IN REF. 2). 

FT CONFLICT 37 37 A -> L (IN REF. 2). 

FT CONFLICT 97 97 T -> A (IN REF. 1). 

FT CONFLICT 129 129 F -> L (IN REF. 1). 

FT CONFLICT 167 167 D -> N (IN REF. 1). 

SQ SEQUENCE 340 AA; 37428 MW; 5777722B28D130E4 CRC64; 



Query Match 3.0%; Score 382; DB 1; Length 340; 

Best Local Similarity 36.7%; Pred. No. 8.8e-16; 

Matches 95; Conservative 34; Mismatches 120; Indels 10; Gaps 2; 

Qy 2015 LRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRILAVDRL 2074 

III : I : | : : : : | | : I I I I : I I I 

Db 11 LRR LET P AI E RE S H GQT S AK S S VP D S AS T VAVD FAGVT K S Y :GNKIWDEL 60 

Qy 2075 CLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQSLGYCP 2134 

I I I I I I I I I I I I I I : : : I I I I II I : : : I I 

Db 61 SFSVASGECFGLLGPNGAGKSTIARMLLGMTCPDAGTITVLGVPVPARARLARRGIGWP 120 

Qy 2135 QCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTYSGGNKRK 2194 

III I I I I : I : : I I : I : I : II I : I I III II 

Db 121 QFDNLDQEFTVRENLLVFGRYFGMSTRQSEAVIPSLLEFARLERKADARVSELSGGMKRC 180 

Qy 2195 LSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEECEALCT 2254 

I : I II I I I : I I I I I I : I I I I : I : I : I : : : : I I : I I I I I I I 

Db 181 LTMT^RALINDPQLIVMDEPTTGLDPHT^HLIWERLRALLARGKTIILTTHFMEEAERLCD 240 

Qy 2255 RLAIMVNGRLRCLGSIQHL 2273 

II:: II I II 

Db 241 RLCVLEKGRNIAEGGPQAL 259 



RESULT 10 
NODI_RHIS3 

ID NODI_RHIS3 STANDARD; PRT; 304 AA. 

AC P72335; 

DT 15-DEC-1998 (Rel. 37, Created) 

DT 15-DEC-1998 (Rel. 37, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Nod factor export ATP-binding protein I (Nodulation ATP-binding 

DE protein I) . 

GN NODI . 

OS Rhizobium sp. (strain N33). 

OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales; 

OC Rhizobiaceae; Rhizobium/ Agrobacterium group; Rhizobium. 

OX NCBI_TaxID=103798; 
RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=96303537; PubMed=8755627; 

RA Cloutier J., Laberge S., Prevost D., Antoun H.; 

RT "Sequence and mutational analysis of the common nodBCIJ region of 

RT Rhizobium sp. (Oxytropis arctobia) strain N33, a nitrogen-fixing 

RT microsymbiont of both arctic and temperate legumes."; 

RL Mol. Plant Microbe Interact. 9:523-531(1996). 

CC -!- FUNCTION: Part of the ABC transporter complex nodIJ (TC 

CC 3.A.1.102.1) involved in the export of LCO (lipo-chitin 

CC oligosaccharide) and a modified beta-1,4-linked N- 

CC acetylglucosamine oligosaccharide. Responsible for energy coupling 

CC to the transport system. Therefore this complex is implicated in 

CC the nodulation induction process (By similarity). 

CC -!- SUBUNIT: The complex is composed of two ATP-binding proteins 

CC (nodI) and two transmembrane proteins (nodJ) (Probable). 

CC -!- SUBCELLULAR LOCATION: Inner membrane-associated (By similarity). 

CC -!- SIMILARITY: Belongs to the ABC transporter family. NodI subfamily. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; U53327; AAB16898.1; 

DR HSSP; Q58663; 1G6H. 

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter. 

DR InterPro; IPR005978; ABC_transptNodI. 

DR Pfam; PF00005; ABC_tran; 1. 

DR ProDom; PD000006; ABC_transporter; 1. 

DR SMART; SM00382; AAA; 1. 

DR TIGRFAMs; TIGR01288; nodI; 1. 

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1. 

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 1. 

KW Nodulation; Transport; Membrane; Inner membrane; ATP-binding. 

FT NP_BIND 38 45 ATP (By similarity). 

SQ SEQUENCE 304 AA; 33698 MW; 7C6A33B0364CCE14 CRC64; 



Query Match 3.0%; Score 380; DB 1; Length 304; 

Best Local Similarity 41.4%; Pred. No. 9.7e-16; 

Matches 89; Conservative 31; Mismatches 93; Indels 2; Gaps 1; 

Qy 2055 NLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFV 2114 

: I I I I I I I : I I I I I I I I I I I I I I I : : : I I I I : I 

Db 7 DLAGVKKS — FGDKLWNGLSFTVASGECFGLLGPNGAGKSTIARMLLGMTVPDAGKITV 64 

Qy 2115 NGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKL 2174 

II ::hl II I I I I 11:1 :: I |:| : |: II 

Db 65 LGEPVGARSRLARKSIGWPQFDNLDQEFTVRENLLVFGRYFGMSTRKIKEVIPSLLEFA 124 

Qy 2175 ELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIK 2234 

I II I I I I I I : I : I I II I : : I I I I i I : I I I I : I : I : 
Db 125 RLESKADARVGELSGGMKRRLTLARALINDPQLLVMDEPTTGLDPHARHLIWERLRFLLA 184 

Qy 2235 TGRSWLTSHSMEECEALCTRLAIMVNGRLRCLGS 2269 

I : : : : I I : I I I I I I I I I : : : I I II 

Db 185 RGKTIILTTHFMEEAERLCDRLCVLEHGRKLAEGS 219 



RESULT 11 
NODI_RHISN 



ID NODI_RHISN STANDARD; PRT; 343 AA. 

AC P55476; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Nod factor export ATP-binding protein I (Nodulation ATP-binding 

DE protein I) . 

GN NODI OR Y4HF. 

OS Rhizobium sp. (strain NGR234). 

OG Plasmid sym pNGR234a. 

OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales; 

OC Rhizobiaceae; Rhizobium/Agrobacterium group; Rhizobium. 

OX NCBI_TaxID=394; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=97305956; PubMed=9163424; 

RA Freiberg C.A., Fellay R., Bairoch A., Broughton W.J., Rosenthal A., 

RA Perret X.; 

RT "Molecular basis of symbiosis between Rhizobium and legumes."; 

RL Nature 387:394-401(1997). 

CC -!- FUNCTION: Part of the ABC transporter complex nodIJ (TC 

CC 3.A.1.102.1) involved in the export of LCO (lipo-chitin 

CC oligosaccharide) and a modified beta-1,4-linked N- 

CC acetylglucosamine oligosaccharide. Responsible for energy coupling 

CC to the transport system. Therefore this complex is implicated in 

CC the nodulation induction process (By similarity). 

CC -!- SUBUNIT: The complex is composed of two ATP-binding proteins 

CC (nodI) and two transmembrane proteins (nodJ) (Probable). 

CC -!- SUBCELLULAR LOCATION: Inner membrane-associated (By similarity). 

CC -!- SIMILARITY: Belongs to the ABC transporter family. NodI subfamily. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; AE000076; AAB91694.1; 

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter. 

DR InterPro; IPR005978; ABC_transptNodI. 

DR Pfam; PF00005; ABC_tran; 1. 

DR ProDom; PD000006; ABC_transporter; 1. 

DR SMART; SM00382; AAA; 1. 

DR TIGRFAMs; TIGR01288; nodI; 1. 

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1. 

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 1. 

KW Nodulation; Transport; Membrane; Inner membrane; ATP-binding; Plasmid. 

FT NP_BIND 77 84 ATP (By similarity). 

SQ SEQUENCE 343 AA; 37917 MW; F49A7EC56E099A33 CRC64; 

Query Match 3.0%; Score 380; DB 1; Length 343; 

Best Local Similarity 35.2%; Pred. No. 1.2e-15; 

Matches 93; Conservative 44; Mismatches 109; Indels 18; Gaps 4; 

Qy 2006 LTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDM VKIENLTKV 2059 

: : : : I I I I I I : : : : I I I : I I I I I I 

Db 1 MQLLTRANVSSSPSRRPESN ALKQKCHGHSNADNSLSRSKSDVAIE-LTNV 50 

Qy 2060 YKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSV 2119 

II I : II : I : I I I I I I I I I I I I I : : : : : I I II I 

Db 51 SKS — YGDKVWDQLSFTITSGECFGLLGPNGAGKSTVSRLVLGLAPPDEGTITVLGEPV 108 

Qy 2120 LKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVVKW7VLEKLELTKY 2179 

: : I II I J I I I I I : I : : I I : : : : I : I 

Db 109 PAR7VRLARSRIGWPQFDTLDREFTARENLLVFGRYFGLHTRELEEAIPPLLDFARLESK 168 

Qy 2180 ADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSV 2239 

III I I I : I : I : I I I I I : I I I I l-l I : I I I I : I : I : I : :: 

Db 169 ADVPVAQLSGGMQRRLTLAC7VLINDPQLLILDEPTTGLDPHARHLIWERLRSLLALGKTI 228 

Qy 2240 VLTSHSMEECEALCTRLAIMVNGR 2263 

: I I : I III : II II : : : I I 
Db 229 LLTTHFMEEADRLCDRLCVIEHGR 252 



RESULT 12 
NODI_RHIGA 

ID NODI_RHIGA STANDARD; PRT; 347 AA. 

AC P50332; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Nod factor export ATP-binding protein I (Nodulation ATP-binding 
DE protein I) . 
GN NODI . 

OS Rhizobium galegae. 

OC Bacteria; Proteobacteria; Alphaproteobacteria ; Rhizobiales; 



OC Rhizobiaceae; Rhizobium/Agrobacterium group; Rhizobium. 

OX NCBI_TaxID=399; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=HAMBI 1174; 

RX MEDLINE=99403395; PubMed=10474187; 

RA Suominen L., Paulin L., Saano A., Saren A.M., Tas E., Lindstrom K.; 

RT "Identification of nodulation promoter (nod-box) regions of Rhizobium 

RT galegae."; 

RL FEMS Microbiol. Lett. 177:217-223(1999). 

CC -!- FUNCTION: Part of the ABC transporter complex nodIJ (TC 

CC 3.A.1.102.1) involved in the export of LCO (lipo-chitin 

CC oligosaccharide) and a modified beta-1,4-linked N- 

CC acetylglucosamine oligosaccharide. Responsible for energy coupling 

CC to the transport system. Therefore this complex is implicated in 

CC the nodulation induction process (By similarity). 

CC -!- SUBUNIT: The complex is composed of two ATP-binding proteins 

CC (nodI) and two transmembrane proteins (nodJ) (Probable). 

CC -!- SUBCELLULAR LOCATION: Inner membrane-associated (By similarity). 

CC -!- SIMILARITY: Belongs to the ABC transporter family. NodI subfamily. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; X87578; CAA60881.1; -. 

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter . 

DR InterPro; IPR005978; ABC_transptNodI . 

DR Pfam; PF00005; ABC_tran; 1. 

DR ProDom; PD000006; ABC_transporter; 1. 

DR SMART; SM00382; AAA; 1. 

DR TIGRFAMs; TIGR01288; nodI; 1. 

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1. 

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 1. 

KW Nodulation; Transport; Membrane; Inner membrane; ATP-binding. 

FT NP_BIND 81 88 ATP (By similarity). 

SQ SEQUENCE 347 AA; 38435 MW; AC791210C44C9A6C CRC64; 

Query Match 3.0%; Score 379; DB 1; Length 347; 
Best Local Similarity 31.3%; Pred. No. 1.4e-15; 

Matches 103; Conservative 56; Mismatches 130; Indels 40; Gaps 6; 

Qy 2019 QRMPVSTKPVEDDVDVASERQRVLR GDADNDMVKIENLTKVYKSRKIGR 2067 

: I : I : I : I I I I I : : : : | : | : : : 

Db 6 EREMLRPKTIAMDQNSASARSNPEREIKTGRLEPASNSAPTMAIDLQAVTMIYRDKTV — 63 

Qy 2068 ILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQ 2127 

M l I I I I I I I I I I I I I I I I : : : M I : : | : | | | : : 

Db 64 - VDSLSFGVRAGECFGLLGPNGAGKSTITRMLLGMATPSAGKISVTjGLPVPGKARLTVR 120 

Qy 2128 QSLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARWKWALEKLELTKYADKPAGTY 2187 
I : I III I I I I : I : : I : I : : : : | | : | | | 



Db 121 ASIGWSQFDNLDMEFTVRENLLVFGRYFQMSTRAIEKLIPSLLEFAQLEAKADVRVSDL 180 



Qy 2188 SGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSME 2247 

I I I I I : I : I I I : I : I I I I I I I : I I I I : I : I : I : : : : I I : I I : 
Db 181 SGGMKRRLTLARALWDPQLLILDEPTTGLDPPARHQIWERLRSLLIRGKTILLTTHMMD 240 

Qy 2248 ECEALCTRLAIMVNGRLRCLG- S IQHLKNRFG — DGYMITVR TKS 2289 

I I : I I I : : I I :. I : : : : I | : | | | 

Db 241 EAERMCDRLCVLEGGRMIAEGPPLSLIEDIIGCPVI EVYGGNPDELSLIVRPHVDRIETS 300 

Qy 2290 SQSV KDWRFFNRNFPEAMLKER 2312 

: : : I II I II III 

Db 301 GETLFCYTVNSDQVRAKLREFPSLRLLER 329 



RESULT 13 
NDI2_RHIME 



ID NDI2_RHIME STANDARD; PRT; 335 AA. 

AC Q8GNH6; 

DT 10-OCT-2003 (Rel. 42, Created) 

DT 10-OCT-2003 (Rel. 42, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Nod factor export ATP-binding protein I (Nodulation ATP-binding 

DE protein I).

GN NODI.

OS Rhizobium meliloti (Sinorhizobium meliloti).

OG Plasmid megaplasmid.

OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;

OC Rhizobiaceae; Sinorhizobium/Ensifer group; Sinorhizobium.

OX NCBI_TaxID=382; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=102L4; 

RX MEDLINE=22343004; PubMed=12455608;

RA Barran L.R., Bromfield E.S., Brown D.C.; 

RT "Identification and cloning of the bacterial nodulation specificity 

RT gene in the Sinorhizobium meliloti-Medicago laciniata symbiosis."; 

RL Can. J. Microbiol. 48:765-771(2002). 

CC -!- FUNCTION: Part of the ABC transporter complex nodIJ (TC
CC 3.A.1.102.1) involved in the export of LCO (lipo-chitin

CC oligosaccharide) and a modified beta-1,4-linked N-

CC acetylglucosamine oligosaccharide. Responsible for energy coupling

CC to the transport system. Therefore this complex is implicated in

CC the nodulation induction process (By similarity).

CC -!- SUBUNIT: The complex is composed of two ATP-binding proteins

CC (nodI) and two transmembrane proteins (nodJ) (Probable).

CC -!- SUBCELLULAR LOCATION: Inner membrane-associated (By similarity).

CC -!- SIMILARITY: Belongs to the ABC transporter family. NodI subfamily.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 



DR EMBL; AF522456; AAN62904.1; -.

DR InterPro; IPR003593; AAA_ATPase.

DR InterPro; IPR003439; ABC_transporter.

DR InterPro; IPR005978; ABC_transptNodI.

DR Pfam; PF00005; ABC_tran; 1.

DR ProDom; PD000006; ABC_transporter; 1.

DR SMART; SM00382; AAA; 1.

DR TIGRFAMs; TIGR01288; nodI; 1.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 1.

KW Nodulation; Transport; Membrane; Inner membrane; ATP-binding; Plasmid.

FT NP_BIND 69 76 ATP (By similarity).

SQ SEQUENCE 335 AA; 36878 MW; 8826A6330FD63CC6 CRC64; 



Query Match 2.9%; Score 367.5; DB 1; Length 335; 

Best Local Similarity 33.3%; Pred. No. 6.6e-15; 

Matches 94; Conservative 38; Mismatches 101; Indels 49; Gaps 5; 

Qy 1982 SPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRV 2041 

I I I I I : I : I : : I I : | : I : I I 
Db 12 SPFEW KGDAGPSVKTL RPHAIPSVA ID LAS 41 

Qy 2042 LRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKML 2101

: I I I : : III I I I I I I I I I I I I I I : : : I : 

Db 42 VTKSYGDKPV VDGLSFTVAAGECFGLLGPNGAGKSTITRMI 82 

Qy 2102 TGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWK 2161 

I = I III : : I I I I I I I I I : I :: I : I : 

Db 83 LGMTTPATGVITVLGVPVPSRARLARMGIGWPQFDNLDSEFTVRENLLVFGRYFRMSTR 142 

Qy 2162 DEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKA 2221

: I : II I I 11111:1:1111 I : I I I I I I I : I I I 

Db 143 EIEAVIPSLLEFARLENKVDARVSDLSGGMKRRLTLARALINDPQLLILDEPTTGLDPHA 202 

Qy 2222 RRFLWNLILDLIKTGRSWLTSHSMEECEALCTRLAIMVNGR 2263 

I : I : I : I : : : : I I : I I I I I I I I I : : II 
Db 203 RHLIWERLRSLLARGKTILLTTHIMEEAERLCDRLCVLEAGR 244 



RESULT 14 
NDI1_RHIME 

ID NDI1_RHIME STANDARD; PRT; 355 AA. 

AC 052618; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Nod factor export ATP-binding protein I (Nodulation ATP-binding 

DE protein I).

GN NODI OR RA0472 OR SMA0864.

OS Rhizobium meliloti (Sinorhizobium meliloti).

OG Plasmid pSymA (megaplasmid 1).

OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;

OC Rhizobiaceae; Sinorhizobium/Ensifer group; Sinorhizobium.

OX NCBI_TaxID=382; 
RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=1021; 



RX MEDLINE=21396509; PubMed=11481432;

RA Barnett M.J., Fisher R.F., Jones T., Komp C., Abola A.P.,

RA Barloy-Hubler F., Bowser L., Capela D., Galibert F., Gouzy J.,

RA Gurjal M., Hong A., Huizar L., Hyman R.W., Kahn D., Kahn M.L.,

RA Kalman S., Keating D.H., Palm C., Peck M.C., Surzycki R., Wells D.H.,

RA Yeh K.-C., Davis R.W., Federspiel N.A., Long S.R.;

RT "Nucleotide sequence and predicted functions of the entire

RT Sinorhizobium meliloti pSymA megaplasmid.";

RL Proc. Natl. Acad. Sci. U.S.A. 98:9883-9888(2001). 

RN [2] 

RP SEQUENCE OF 143-355 FROM N.A.

RC STRAIN=1021; 

RA Barnett M.J., Long S.R.; 

RT "Nucleotide sequence of nodIJ region of Rhizobium meliloti pSymA.";

RL Submitted (JAN-1998) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Part of the ABC transporter complex nodIJ (TC

CC 3.A.1.102.1) involved in the export of LCO (lipo-chitin

CC oligosaccharide) and a modified beta-1,4-linked N-

CC acetylglucosamine oligosaccharide. Responsible for energy coupling

CC to the transport system. Therefore this complex is implicated in

CC the nodulation induction process (By similarity).

CC -!- SUBUNIT: The complex is composed of two ATP-binding proteins

CC (nodI) and two transmembrane proteins (nodJ) (Probable).

CC -!- SUBCELLULAR LOCATION: Inner membrane-associated (By similarity).

CC -!- SIMILARITY: Belongs to the ABC transporter family. NodI subfamily.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).
CC 

DR EMBL; AE007237; AAK65130.1; -. 

DR EMBL; AF043118; AAB97762.1; -. 

DR PIR; H95320; H95320. 

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR003439; ABC_transporter.

DR InterPro; IPR005978; ABC_transptNodI.

DR Pfam; PF00005; ABC_tran; 1.

DR ProDom; PD000006; ABC_transporter; 1.

DR SMART; SM00382; AAA; 1.

DR TIGRFAMs; TIGR01288; nodI; 1.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.

DR PROSITE; PS50893; ABC_TRANSPORTER_2; 1.

KW Nodulation; Transport; Membrane; Inner membrane; ATP-binding; Plasmid;

KW Complete proteome.

FT NP_BIND 89 96 ATP (By similarity).

SQ SEQUENCE 355 AA; 39268 MW; 4DC8696D98C335DC CRC64; 

Query Match 2.9%; Score 365.5; DB 1; Length 355; 
Best Local Similarity 33.8%; Pred. No. 9.5e-15; 

Matches 95; Conservative 35; Mismatches 102; Indels 49; Gaps 4; 

Qy 1982 SPFEWDIVTRGLVAMAVEGWGFLLTIMCQYNFLRRPQRMPVSTKPVEDDVDVASERQRV 2041

Db 32 SPFEWKDQT GLAVKT AI PG AKPTV-AIDVAS - 61

Qy 2042 LRGDADNDMVKIENLTKVYKSRKIGRILAVDRLCLGVRPGECFGLLGVNGAGKTSTFKML 2101

•'HI : ' - I I llllllll Mill:: : | :

Db 62 VTKSYGDKPV INGLS FTVAAGECFGLLGPNGAGKSTITRMI 102

Qy 2102 TGDESTTGGEAFVNGHSVLKELLQVQQSLGYCPQCDALFDELTAREHLQLYTRLRGISWK 2161

: M II I : : I I I I I I I I I : I : : I : I :

Db 103

Qy 2162 DEARVVKWALEKLELTKYADKPAGTYSGGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKA 2221

: I : N I M I I I I I : I : I I I I | : I I I I I | | : | | |

Db 163 EIEAVIPSLLEFARLENKADARVSDLSGGMKRRLTLARALINDPQLLILDEPTTGLDPHA 222

Qy 2222

I :| : I: |::::||:| ||| | || || :: , |

Db 223



RESULT 15 
YBHF_ECOLI 

ID YBHF_ECOLI STANDARD; PRT; 578 AA.

AC P75776; Q9R7S3; Q9R7S4; 

DT 15-JUL-1998 (Rel. 36, Created) 

DT 15-JUL-1998 (Rel. 36, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Hypothetical ABC transporter ATP-binding protein ybhF. 

GN YBHF OR B0794 OR SF0744 OR S0785. 

OS Escherichia coli, and 

OS Shigella flexneri. 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;

OC Enterobacteriaceae; Escherichia.

OX NCBI_TaxID=562, 623;

RN [1] 

RP SEQUENCE FROM N.A. 

RC SPECIES=E.coli; STRAIN=K12 / MG1655; 

RX MEDLINE=97426617; PubMed=9278503;

RA Blattner F.R., Plunkett G. III, Bloch C.A., Perna N.T., Burland V.,

RA Riley M., Collado-Vides J., Glasner J.D., Rode C.K., Mayhew G.F.,

RA Gregor J., Davis N.W., Kirkpatrick H.A., Goeden M.A., Rose D.J.,

RA Mau B., Shao Y.;

RT "The complete genome sequence of Escherichia coli K-12.";

RL Science 277:1453-1474(1997).

RN [2] 

RP SEQUENCE FROM N.A. 

RC SPECIES=E.coli; STRAIN=K12;

RX MEDLINE=97061202; PubMed=8905232;

RA Oshima T., Aiba H., Baba T., Fujita K., Hayashi K., Honjo A.,

RA Ikemoto K., Inada T., Itoh T., Kajihara M., Kanai K., Kashimoto K.,

RA Kimura S., Kitagawa M., Makino K., Masuda S., Miki T., Mizobuchi K.,

RA Mori H., Motomura K., Nakamura Y., Nashimoto H., Nishio Y., Saito N.,

RA Sampei G., Seki Y., Tagami H., Takemoto K., Wada C., Yamamoto Y.,

RA Yano M., Horiuchi T.;

RT "A 718-kb DNA sequence of the Escherichia coli K-12 genome 

RT corresponding to the 12.7-28.0 min region on the linkage map."; 

RL DNA Res. 3:137-155(1996). 

RN [3] 



RP SEQUENCE FROM N.A.

RC SPECIES=S.flexneri; STRAIN=301 / Serotype 2a;
RX MEDLINE=22272406; PubMed=12384590;

RA Jin Q., Yuan Z., Xu J., Wang Y., Shen Y., Lu W., Wang J., Liu H.,
RA Yang J., Yang F., Zhang X., Zhang J., Yang G., Wu H., Qu D., Dong J.,
RA Sun L., Xue Y., Zhao A., Gao Y., Zhu J., Kan B., Ding K., Chen S.,
RA Cheng H., Yao Z., He B., Chen R., Ma D., Qiang B., Wen Y., Hou Y.,
RA Yu J.;

RT "Genome sequence of Shigella flexneri 2a: insights into pathogenicity
RT through comparison with genomes of Escherichia coli K12 and O157.";
RL Nucleic Acids Res. 30:4432-4441(2002).
RN [4]

RP SEQUENCE FROM N.A.

RC SPECIES=S.flexneri; STRAIN=2457T / ATCC 700930 / Serotype 2a;
RX MEDLINE=22590274; PubMed=12704152;

RA Wei J., Goldberg M.B., Burland V., Venkatesan M.M., Deng W.,
RA Fournier G., Mayhew G.F., Plunkett G. III, Rose D.J., Darling A.,
RA Mau B., Perna N.T., Payne S.M., Runyen-Janecky L.J., Zhou S.,
RA Schwartz D.C., Blattner F.R.;

RT "Complete genome sequence and comparative genomics of Shigella
RT flexneri serotype 2a strain 2457T.";
RL Infect. Immun. 71:2775-2786(2003).

CC -!- SIMILARITY: Belongs to the ABC transporter family.
DR EMBL; AE000181; AAC73881.1; ALT_INIT.
DR EMBL; D90716; BAA35454.1; ALT_INIT.
DR EMBL; D90717; BAA35460.1; ALT_INIT.
DR EMBL; AE015103; AAN42379.1; ALT_INIT.
DR EMBL; AE016980; AAP16256.1; ALT_INIT.
DR EcoGene; EG13314; ybhF.
DR InterPro; IPR003593; AAA_ATPase.
DR InterPro; IPR003439; ABC_transporter.
DR Pfam; PF00005; ABC_tran; 2.
DR ProDom; PD000006; ABC_transporter; 2.
DR SMART; SM00382; AAA; 2.

DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.
DR PROSITE; PS50893; ABC_TRANSPORTER_2; 2.
KW Hypothetical protein; ATP-binding; Transport; Repeat;
KW Complete proteome.



FT DOMAIN 1 237 ABC TRANSPORTER 1.
FT DOMAIN 330 559 ABC TRANSPORTER 2.
FT NP_BIND 40 47 ATP (POTENTIAL).
FT NP_BIND 362 369 ATP (POTENTIAL).
FT CONFLICT 44 44 A -> E (IN REF. 4).
SQ SEQUENCE 578 AA; 63132 MW; DB3B3FA213490F3C CRC64;



Query Match 2.8%; Score 354; DB 1; Length 578; 

Best Local Similarity 15.1%; Pred. No. 1.1e-13;

Matches 194; Conservative 112; Mismatches 236; Indels 744; Gaps 18;



Qy 990 WCVDKLTKVYKDDKKLALNKLSLNLYENQWSFLGHNGAGKTTTMSILTGLFPPTSGSA 1049
I: :: I I : I |: I :: I : I : I I I I I I I : I I I Mill

Db 5 VITLNGLEKRFPGMDKPAVAPLDCTIHAGYVTGLVGPDGAGKTTLMRMLAGLLKPDSGSA 64

Qy 1050 TIYGHDIRTEMDEIRKNLGMCPQHNVLFDRLTVEEHLWFYSRLKSMAQEEIRREMDKMIE 1109
I: I I : II II I : : I I I I : I I : | : | : | : : : : : |

Db 65 TVIGFDPIKNDGAXHAVLGYMPQKFGLYEDLTVMENLNLYADLRSVTGEARKQTFARLLE 124

Qy 1110 DLELSNKRHSLVQTLSGGMKRKLSVAIAFVGGSRAIILDEPTAGVDPYARRAIWDLILKY 1169
I I 111111:11 :| II : : : I M I I I I I : M : I : : :

Db 125 FTSLGPFTGRLAGKLSGGMKQKLGLACTLVGEPKVLLLDEPGVGVDPISRRELWQMVHEL 184

Qy 1170 K-PGRTILLSTHHMDEADLLGDRIAIISHGKLKCCGSPLFLKGTYGDGYRLTLVKRPAEP 1228
I I I I I : : I I I : I : : : : I : I I |
Db 185 AGEGMLILWSTSYLDEAEQCRD-VLLMNEGELLYQGEP 221

Qy 1229 GGPQEPGLASSPPGRAPLSSCSELQVSQFIRKHVASCLLVSDTSTELSYILPSEAAKKGA 1288
Db 222 - 221

Qy 1289 FERLFQHLERSLDALHLSSFGLMDTTLEEVFLKVSEEDQSLENSEADVKESRKDVLPGAE 1348
Db 222 221

Qy 1349 GPASGEGHAGNLARCSELTQSQASLQSASSVGSARGDEGAGYTDVYGDYRPLFDNPQDPD 1408

II I : I I : I : I
Db 222 KALTQTMA GRSF LMTSPH 239

Qy 1409 NVSLQEVEAEALSRVGQGSRKLDGGWLKVRQFHGLLVKRFHCARRNSKALFSQILLPAFF 1468

: I : I I I

Db 240 EGNRKL 245

Qy 1469 VCVAMTVALSVPEIGDLPPLVLSPSQYHNYTQPRGNFIPYANEERREYRLRLSPDASPQQ 1528
: I I : I : : I : : I I I : I : I

Db 246 LQRALKLPQVSD GMIQGKSVRLILKKEATPDD 277

Qy 1529 LVSTFRLPSGVGATCVLKSPANGSLGPTLNLSSGESRLLAARFFDSMCLESFTQGLPLSN 1588
: III
Db 278 I RHADGM 284

Qy 1589 FVPPPPSPAPSDSPASPDEDLQAWNVSLPPTAGPEMWTSAPSLPRLVREPVRCTCSAQGT 1648

II:

Db 285 PEI 287

Qy 1649 GFSCPSSVGGHPPQMRWTGDILTDITGHNVSEYLLFTSDRFRLHRYGAITFGNVLKSIP 1708

I : : I I : II

Db 288 NINE TTPRFE 297

Qy 1709 ASFGTRAPPMVRKIAVRRAAQVFYNNKGYHSMPTYLNSLNNAILRANLPKSKGNP7U\YGI 1768

::: | |

Db 298 DAFIDLLGGA 307



Qy 1769 TVTNHPMNKTSASLSLDYLLQGTDWIAIFIIVAMSFVPASFWFLVAEKSTKAKHLQFV 1828



Qy 1829 SGCNPIIYWLANYVWDMLNYLVPATCCVIILFVFDLPAYTSPTNFPAVLSLFLLYGWSIT 1888 

I I : 

Db 308 GTSES 312 

Qy 1889 PIMYPASFWFEVPSSAYVFLIVINLFIGITATVATFLLQLFEHDKDLKWNSYLKSCFLI 1948 

I : 

Db 313 PL 314 

Qy 1949 FPNYNLGHGLMEMAYNEYINEYYAKIGQFDKMKSPFEWDIVTRGLVAMAVEGWGFLLTI 2008 

I : III I 

Db 315 GAILHTVEGTPG 326 



Qy 2009 MCQYNFLRRPQRMPVSTKPVEDDVDVASERQRVLRGDADNDMVKIENLTKVYKSRKIGRI 2068 

: : : : I I I II 

Db 327 ETVI EAKELTK KFGDF 342 

Qy 2069 LAVDRLCLGVRPGECFGLLGVNGAGKTSTFKMLTGDESTTGGEAFVNGHSVLKELLQVQQ 2128 

II: I : I I I I I I I I I I I I : : I I I I : I hi I : : : : I 

Db 343 AATDHVNFAVKRGEIFGLLGPNGAGKSTTFKMMCGLLVPTSGQALVLGMDLKESSGKARQ 402

Qy 2129 SLGYCPQCDALFDELTAREHLQLYTRLRGISWKDEARVVKWALEKLELTKYADKPAGTYS 2188 

Ml I :|: II , ::|: :: : |: : : : | | | 

Db 403 HLGYMAQKFSLYGNLTVEQNLRFFSGVYGLRGRAQNEKISRMSEAFGLKSIASHATDELP 462

Qy 2189 GGNKRKLSTAIALIGYPAFIFLDEPTTGMDPKARRFLWNLILDLIKTGRSWLTSHSMEE 2248

I I : : I : I : I : I : I I I I I I : I : I I II I I : : : I : I : : I : I I : I
Db 463 LGFKQRIAIACSLMHEPDILFLDEPTSGVDPLTRREFWLHINS1WEKGVTVMVTTHFMDE 522

Qy 2249 CEALCTRLAIMVNGRLRCLGSIQHLK 2274 

I II::: I : I I : II 
Db 523 AE-YCDRIGLVYRGKLIASGTPDDLK 547



Search completed: September 1, 2004, 10:53:18 
Job time : 54 secs



